Building a Data Analyst AI
My day job involves a lot of data engineering, like a lot. Sometimes I need to write quick nested SQL, wrangle some CSVs, maybe parse JSON. ChatGPT is great, but it can't run and test the code, and it can't hold a multi-turn conversation about my own data model.
I use DuckDB for ad-hoc tasks: it's fast, can load most kinds of data (CSV, JSON, Parquet), and works out of the box.
Normally such tasks would take me anywhere from one to three hours, so I've automated them with 15 lines of code. Here's an example:
import json

from phi.assistant.duckdb import DuckDbAssistant

# Describe the tables the assistant is allowed to query.
duckdb_assistant = DuckDbAssistant(
    semantic_model=json.dumps({
        "tables": [
            {
                "name": "movies",
                "description": "Contains information about movies from IMDB.",
                "path": "https://phidata-public.s3.amazonaws.com/demo_data/IMDB-Movie-Data.csv",
            }
        ]
    }),
)

# Ask questions in plain English.
duckdb_assistant.print_response("What is the average rating of movies? Show me the SQL.")
duckdb_assistant.print_response("What is the revenue per year?")
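The semantic model in the example is just a JSON string, so it can be built from a plain list of table descriptions. Here is a minimal sketch of how that might look with more than one table. The helper name build_semantic_model, the check for the three fields used above, and the second "ratings" table with its path are all my own assumptions for illustration, not part of the library.

```python
import json


def build_semantic_model(tables):
    """Serialize table descriptions into a JSON string for semantic_model.

    Hypothetical helper: it only checks for the three fields used in the
    example above (name, description, path).
    """
    required = ("name", "description", "path")
    entries = []
    for table in tables:
        missing = [key for key in required if not table.get(key)]
        if missing:
            raise ValueError(f"table entry is missing fields: {missing}")
        # Keep only the fields shown in the original example.
        entries.append({key: table[key] for key in required})
    return json.dumps({"tables": entries}, indent=2)


semantic_model = build_semantic_model([
    {
        "name": "movies",
        "description": "Contains information about movies from IMDB.",
        "path": "https://phidata-public.s3.amazonaws.com/demo_data/IMDB-Movie-Data.csv",
    },
    {
        # Hypothetical second table and path, for illustration only.
        "name": "ratings",
        "description": "Hypothetical per-user ratings for each movie.",
        "path": "data/ratings.csv",
    },
])
print(semantic_model)
```

The resulting string can be passed as semantic_model=semantic_model in place of the inline json.dumps call. Failing early on a missing field is a design choice of this sketch, so a typo in a table entry shows up before any question is asked.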
Read more about the DuckDbAssistant
Here it is in action: