首页
学习
活动
专区
圈层
工具
发布
社区首页 >专栏 >使用 Vanna + Ollama + ChromaDB,实现 自然语言查询 / ChatBI

使用 Vanna + Ollama + ChromaDB,实现 自然语言查询 / ChatBI

作者头像
Michael阿明
发布2026-03-25 13:55:32
发布2026-03-25 13:55:32
3400
举报

文章目录

  • 1. 解决的核心问题是什么?
    • 1. 核心问题与解决方案
  • 2. 实现原理
    • 1. RAG 架构:向量检索 + 生成模型
    • 2. 执行与自我学习
    • 3. 可视化输出(可选)
    • 4. 技术优势详析
    • 5. 总结
  • 3. 实战案例
    • 3.1 清理目录 + 构造 SQLite 示例库
    • 3.2 Vanna 入口类:`MyVanna`
    • 3.3 训练三件套
    • 3.4 ask():自然语言 → SQL → DataFrame \[→ Plotly]
      • 问题 1:跨年总营业额对比
      • 问题 2:月度环比增速
  • 4.  结语

1. 解决的核心问题是什么?

Vanna AI 的产品核心 解决了「自然语言到 SQL 查询」之间的鸿沟,使非技术人员也能轻松获得数据库中的洞察

1. 核心问题与解决方案

  1. 门槛高的 SQL 查询编写 SQL 往往需要专业技能,会对很多业务人员造成阻碍。 Vanna AI 允许用户使用普通语言提问,例如 “上季度各地区销售额是多少?”,系统自动生成 SQL 并执行,将结果反馈出来
  2. 提高分析效率,释放专业人力通过自然语言接口,业务用户可以直接自主查询,大幅减少对数据团队支持的依赖,加快洞察获取速度
  3. 复杂查询自动化与自学习能力支持多表联查、子查询、聚合等复杂语句,利用 Retrieval-Augmented Generation (RAG) 帮助 LLM 理解数据库结构与上下文,从而生成准确的 SQL,效果随训练数据越多逐步提升
  4. 数据安全与灵活部署Vanna AI 支持本地部署或集成到自家基础设施,所有 SQL 在本地执行,仅传 schema 和训练元数据给 LLM,不会泄露真实数据给云端

Vanna AI 的目标是实现 无需写 SQL,也能直接“聊天式”访问数据库的能力,从而提升数据分析效率与业务自主性,同时保障数据安全和系统灵活性。

2. 实现原理

Vanna AI 的核心技术是将 Retrieval‑Augmented Generation(检索增强生成,简称 RAG)大型语言模型(LLM) 结合,形成一个能将自然语言转 SQL 的智能系统。以下是它的实现原理:

在这里插入图片描述
在这里插入图片描述

在这里插入图片描述

1. RAG 架构:向量检索 + 生成模型

  • 训练阶段(vn.train(...)
    • 将数据库的 DDL(表结构定义)、字段说明、示例 SQL,以及用户历史问题–SQL 对添加到向量数据库(如 ChromaDB、PgVector 等)中
    • 每条内容会被编码为 embedding 形式,供后续检索使用 。
  • 查询阶段(vn.ask(...)
    • 用户输入自然语言问题,系统将其编码为向量后在向量库中进行相似检索,找到最相关的 DDL、文档和示例 SQL
    • 将检索到内容作为提示(prompt)上下文输入到 LLM(支持 GPT、Anthropic、Gemini 等),由 LLM 生成对应 SQL 查询

2. 执行与自我学习

  • 系统运行生成的 SQL,通过数据库执行验证其正确性。
  • 成功执行后,该问题-查询对会自动反馈回训练数据池,进一步丰富向量库;若失败,则需人工修正并加入训练

3. 可视化输出(可选)

  • 在 Jupyter 或前端界面下,系统还可以调用 LLM 生成 Plotly 可视化代码,将结果绘图并展示给用户 。

4. 技术优势详析

  • 上下文丰富,提升准确率:通过检索历史问答、schema 信息等内容,LLM 在有限 token 上更具针对性,复杂查询生成更准确 。
  • 自适应学习:系统不断将有效的实例加入训练,以促进模型精进
  • 轻量部署与跨模型兼容:采用 RAG 不需对 LLM 微调,只需替换模型与向量库,可自由选择 GPT、Gemini、Anthropic,甚至 Ollama 等
  • 数据安全:实际 SQL 生成与执行,仅在本地环境完成,向 LLM 或向量库传输的只是 schema 和示例内容,不泄露真实数据内容 。

5. 总结

Vanna AI 本质是一个通过向量检索 + LLM 推理构建的 可训练、可执行、可可视化的自然语义 SQL 助手。核心特色包括:

  • 基于用户数据自定义训练 RAG
  • 拥有自动反馈学习机制
  • 支持多模型、向量库、数据库
  • 向业务人员提供图形化呈现选项
  • 严控数据隐私,全部在用户端执行

3. 实战案例

SQLite + ChromaDB + Ollama(Qwen2.5‑1.5B)端到端示例

目标: 在本地快速生成一套“部门月度营业额”示例数据库; 给 Vanna AI 喂入 DDL、文档说明与多条示例 SQL; 用中文自然语言提出问题,让 Vanna AI 自动输出 SQL 与 Plotly 图;

代码语言:javascript
复制
pip install 'vanna[chromadb,ollama]' plotly pandas sqlite3

实战代码

代码语言:javascript
复制
import sqlite3
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import random
from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore

# … ① 目录清理 + 示例库构建
# … ② MyVanna 子类:同时继承 ChromaDB 向量库与 Ollama LLM
# … ③ 训练:DDL / 文档 / 示例 SQL
# … ④ 两个自然语言问题:总营业额对比、月度增长率
#     → ask() 生成 SQL + DataFrame (+ Plotly 代码)
# … ⑤ 将 fig.write_html / write_image 保存到本地

db_name = 'example.db'

# 先清理已有的ChromaDB数据
import shutil
import os
if os.path.exists('chroma.sqlite3'):
    os.remove('chroma.sqlite3')
if os.path.exists(db_name):
    os.remove(db_name)
# 清理ChromaDB相关文件夹
for item in os.listdir('.'):
    if os.path.isdir(item) and len(item) == 36 and '-' in item:  # UUID格式文件夹
        shutil.rmtree(item)

# 创建本地数据库并插入示例数据
def create_sample_database(db_name='example.db'):
    conn = sqlite3.connect(db_name)
    cursor = conn.cursor()
    
    # 创建部门信息表
    cursor.execute('''
    CREATE TABLE IF NOT EXISTS departments (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL,
        manager_name TEXT NOT NULL,
        location TEXT NOT NULL
    )
    ''')
    
    # 创建营业额表
    cursor.execute('''
    CREATE TABLE IF NOT EXISTS revenue (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        dept_id INTEGER,
        year INTEGER NOT NULL,
        month INTEGER NOT NULL,
        revenue_amount DECIMAL(12,2) NOT NULL,
        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
    )
    ''')
    
    # 插入部门数据
    departments_data = [
        (1, '销售部', '张经理', '北京'),
        (2, '市场部', '李经理', '上海'),
        (3, '技术部', '王经理', '深圳'),
    ]
    
    cursor.executemany('INSERT OR REPLACE INTO departments (dept_id, dept_name, manager_name, location) VALUES (?, ?, ?, ?)', departments_data)
    
    # 插入2023年和2024年的营业额数据
    revenue_data = []
    
    # 为每个部门生成两年的月度营业额数据
    for dept_id in [1, 2, 3, ]:
        # 2023年数据
        base_revenue_2023 = {1: 1500, 2: 1200, 3: 1800,}[dept_id]
        for month in range(1, 13):
            # 添加一些季节性变化和随机波动
            seasonal_factor = 1 + 0.1 * (month % 4 - 2) / 2
            random_factor = 1 + random.uniform(-0.2, 0.2)
            amount = base_revenue_2023 * seasonal_factor * random_factor
            revenue_data.append((dept_id, 2023, month, int(amount)))
        
        # 2024年数据 - 有些部门增长,有些下降
        growth_rates = {1: -0.05, 2: 0.08, 3: 0.12,}  # 销售部和运营部下降
        base_revenue_2024 = base_revenue_2023 * (1 + growth_rates[dept_id])
        
        for month in range(1, 13):
            # 2024年月度增长趋势
            monthly_growth = growth_rates[dept_id] / 12 * month
            seasonal_factor = 1 + 0.1 * (month % 4 - 2) / 2
            random_factor = 1 + random.uniform(-0.15, 0.15)
            amount = base_revenue_2024 * (1 + monthly_growth) * seasonal_factor * random_factor
            revenue_data.append((dept_id, 2024, month, int(amount)))
    
    cursor.executemany('INSERT OR REPLACE INTO revenue (dept_id, year, month, revenue_amount) VALUES (?, ?, ?, ?)', revenue_data)
    
    conn.commit()
    # 展示数据库里面的销售数据
    df_revenue = pd.read_sql_query("SELECT * FROM revenue order by dept_id, year, month", conn)
    print("营业额数据:")
    for index, row in df_revenue.iterrows():
        print(f"部门ID: {row['dept_id']}, 年份: {row['year']}, 月份: {row['month']}, 营业额: {row['revenue_amount']}")
    conn.close()
    
    print("数据库创建完成,已插入示例数据")

# 创建示例数据库
create_sample_database(db_name)

class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)


vn = MyVanna(config={'model': 'qwen2.5:1.5b'}) 
vn.connect_to_sqlite(db_name)

# 导入 schema
df_ddl = vn.run_sql("SELECT type, sql FROM sqlite_master WHERE sql is not null")
for ddl in df_ddl['sql']:
    print("DDL:", ddl)
    vn.train(ddl=ddl)

# 训练文档说明
vn.train(documentation="示例数据库包含部门信息表 departments,字段有 dept_id, dept_name, manager_name, location")
vn.train(documentation="示例数据库包含营业额表 revenue,字段有 id, dept_id, year, month, revenue_amount")
vn.train(documentation="数据包含2023年和2024年每个部门每个月的营业额数据")

# 训练一些示例 SQL
vn.train(sql="SELECT d.dept_name, SUM(r.revenue_amount) as total_revenue FROM departments d JOIN revenue r ON d.dept_id = r.dept_id WHERE r.year = 2024 GROUP BY d.dept_id, d.dept_name", question="查询2024年各部门总营业额")
vn.train(sql="SELECT year, AVG(revenue_amount) as avg_revenue FROM revenue GROUP BY year", question="查询各年平均营业额")
vn.train(sql="SELECT d.dept_name, r.year, r.month, r.revenue_amount FROM departments d JOIN revenue r ON d.dept_id = r.dept_id ORDER BY d.dept_name, r.year, r.month", question="查询所有部门营业额明细")
vn.train(sql="""
SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) - SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as change_amount
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_id, d.dept_name
""", question="比较各部门2023年和2024年营业额变化")
vn.train(sql="""
WITH monthly_revenue AS (
    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total
    FROM departments d 
    JOIN revenue r ON d.dept_id = r.dept_id 
    WHERE r.year = 2024
    GROUP BY d.dept_name, r.year, r.month
),
growth_rates AS (
    SELECT 
        dept_name,
        month,
        monthly_total,
        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,
        CASE 
            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 
            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)
            ELSE 0 
        END as growth_rate
    FROM monthly_revenue
)
SELECT dept_name, month, growth_rate FROM growth_rates WHERE growth_rate IS NOT NULL
""", question="计算各部门2024年月度增长率")

# 问题1:哪些部门比去年营业额下降了?

question1 = "比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图"
print(f"\n=== 问题1:{question1} ===")
result1 = vn.ask(question=question1, visualize=False)

if isinstance(result1, tuple) and len(result1) >= 2:
    sql, df = result1[0], result1[1]
    print("SQL:", sql)
    print("查询结果:")
    print(df)
    
    # 确保sql和df不为None
    if sql and df is not None:
        # 尝试生成可视化
        try:
            plotly_code = vn.generate_plotly_code(question=question1, sql=sql, df=df)
            print("生成的图表代码:")
            print(plotly_code)
            
            if plotly_code:
                fig = vn.get_plotly_figure(plotly_code, df)
                if fig:
                    fig.write_html("department_decline_chart.html")
                    print("图表已保存到 department_decline_chart.html")
                    # fig.show()
        except Exception as e:
            print(f"生成可视化时出错: {e}")
    else:
        print("SQL或数据为空")
else:
    print("结果:", result1)

print("\n" + "="*50 + "\n")

# 问题2:各个部门今年的月度营业额平均增长率,要求可视化
print("=== 问题2:各个部门今年的月度营业额平均增长率 ===")
result2 = vn.ask(question="计算各个部门2024年每月营业额相比上月的增长率,用曲线图,横轴月份,纵轴增长率", visualize=True)

if result2:
    print("结果类型:", type(result2))
    if isinstance(result2, tuple) and len(result2) >= 2:
        print("SQL:", result2[0])
        print("数据结果:")
        print(result2[1])
        
        # 检查是否有图表数据
        if len(result2) >= 3 and result2[2] is not None:
            print("可视化图表:", type(result2[2]))
            try:
                # 如果result[2]是字符串代码,执行它
                if isinstance(result2[2], str):
                    print("图表代码:")
                    print(result2[2])
                    
                    # 创建一个安全的执行环境
                    local_vars = {
                        'px': px,
                        'go': go,
                        'pd': pd,
                        'df': result2[1]  # 使用查询结果作为df
                    }
                    
                    # 执行代码生成图表
                    exec(result2[2], globals(), local_vars)
                    
                    # 查找生成的figure对象
                    fig = None
                    for var_name, var_value in local_vars.items():
                        if hasattr(var_value, 'write_html') and hasattr(var_value, 'show'):
                            fig = var_value
                            break
                    
                    if fig:
                        # 保存图表
                        fig.write_html("department_growth_chart.html")
                        print("图表已保存到 department_growth_chart.html")
                        
                        try:
                            fig.write_image("department_growth_chart.png")
                            print("图表已保存到 department_growth_chart.png")
                        except Exception as e:
                            print(f"保存PNG失败 (需要kaleido): {e}")
                        
                        # 显示图表
                        # fig.show()
                    else:
                        print("未找到生成的图表对象")
                        
                else:
                    # 如果result[2]已经是图表对象
                    result2[2].write_html("department_growth_chart.html")
                    print("图表已保存到 department_growth_chart.html")
                    
                    try:
                        result2[2].write_image("department_growth_chart.png")
                        print("图表已保存到 department_growth_chart.png")
                    except Exception as e:
                        print(f"保存PNG失败 (需要kaleido): {e}")
                    
                    result2[2].show()
                    
            except Exception as e:
                print(f"处理图表时出错: {e}")
        else:
            print("没有找到图表数据")
    else:
        print("结果:", result2)

3.1 清理目录 + 构造 SQLite 示例库

代码语言:javascript
复制
# 删除残留向量库文件
if os.path.exists('chroma.sqlite3'): os.remove('chroma.sqlite3')
# 创建两张表:departments / revenue
# 用随机数 + 季节因子 + 年增长率 → 可重复但略带波动的月度营业额
  • 构造 2023、2024 两年 × 3 部门 × 12 月,方便做同比/环比/增速。
  • random.uniform() 添加 ±20 % 以内随机波动,更贴近真实业务。

3.2 Vanna 入口类:MyVanna

代码语言:javascript
复制
class MyVanna(ChromaDB_VectorStore, Ollama):
    ...
  • 多重继承同时绑定:
    1. 向量库实现(ChromaDB);
    2. LLM 调用(Ollama‑Qwen)。
  • Vanna SDK 会在内部自动把 (自然语言问题 ↔ prompt)、(上下文 ↔ 向量检索)、(SQL ↔ LLM) 串起来。

3.3 训练三件套

  1. DDLvn.train(ddl=...)
    • 让 LLM 清楚表结构、字段名、主键外键。
  2. 文档vn.train(documentation=...)
    • 辅助说明业务含义(如“营业额单位:元”)。
  3. 示例 SQL ↔ 问题vn.train(sql=..., question=...)
    • Few‑shot 提升复杂查询准确率;日后 ask 时也会先检索这些样例再作推理。

❝最佳实践

  • “每张表 1 ~ 2 条简单查询 + 1 条 JOIN 查询”基本可以覆盖大多问题;
  • 如果命中率不够,再增补示例,而不是一开始就喂大量 SQL。

3.4 ask():自然语言 → SQL → DataFrame [→ Plotly]

问题 1:跨年总营业额对比
  • LLM 生成 SQL(含 case‑when 求和)→ 正常执行
  • generate_plotly_code 按列类型自动选用 **px.bar**,barmode="group" 满足“横向并排”需求。
  • 写文件fig.write_html 可嵌入任何前端;write_image 依赖 kaleido。

以下是生成的交互式图:

在这里插入图片描述
在这里插入图片描述

在这里插入图片描述

问题 2:月度环比增速

这个问题,LLM没有返回正确的SQL语句,执行失败了。

  • LLM 返回的 SQL 没有别名 r:最终 JOIN 阶段错误 → no such column: r.month

附完整日志:

代码语言:javascript
复制
营业额数据:
部门ID: 1, 年份: 2023, 月份: 1, 营业额: 1498
部门ID: 1, 年份: 2023, 月份: 2, 营业额: 1593
部门ID: 1, 年份: 2023, 月份: 3, 营业额: 1689
部门ID: 1, 年份: 2023, 月份: 4, 营业额: 1536
部门ID: 1, 年份: 2023, 月份: 5, 营业额: 1369
部门ID: 1, 年份: 2023, 月份: 6, 营业额: 1260
部门ID: 1, 年份: 2023, 月份: 7, 营业额: 1501
部门ID: 1, 年份: 2023, 月份: 8, 营业额: 1435
部门ID: 1, 年份: 2023, 月份: 9, 营业额: 1292
部门ID: 1, 年份: 2023, 月份: 10, 营业额: 1698
部门ID: 1, 年份: 2023, 月份: 11, 营业额: 1844
部门ID: 1, 年份: 2023, 月份: 12, 营业额: 1534
部门ID: 1, 年份: 2024, 月份: 1, 营业额: 1156
部门ID: 1, 年份: 2024, 月份: 2, 营业额: 1590
部门ID: 1, 年份: 2024, 月份: 3, 营业额: 1286
部门ID: 1, 年份: 2024, 月份: 4, 营业额: 1373
部门ID: 1, 年份: 2024, 月份: 5, 营业额: 1442
部门ID: 1, 年份: 2024, 月份: 6, 营业额: 1248
部门ID: 1, 年份: 2024, 月份: 7, 营业额: 1593
部门ID: 1, 年份: 2024, 月份: 8, 营业额: 1155
部门ID: 1, 年份: 2024, 月份: 9, 营业额: 1488
部门ID: 1, 年份: 2024, 月份: 10, 营业额: 1165
部门ID: 1, 年份: 2024, 月份: 11, 营业额: 1297
部门ID: 1, 年份: 2024, 月份: 12, 营业额: 1312
部门ID: 2, 年份: 2023, 月份: 1, 营业额: 1188
部门ID: 2, 年份: 2023, 月份: 2, 营业额: 1171
部门ID: 2, 年份: 2023, 月份: 3, 营业额: 1207
部门ID: 2, 年份: 2023, 月份: 4, 营业额: 918
部门ID: 2, 年份: 2023, 月份: 5, 营业额: 1022
部门ID: 2, 年份: 2023, 月份: 6, 营业额: 1064
部门ID: 2, 年份: 2023, 月份: 7, 营业额: 1182
部门ID: 2, 年份: 2023, 月份: 8, 营业额: 1189
部门ID: 2, 年份: 2023, 月份: 9, 营业额: 1147
部门ID: 2, 年份: 2023, 月份: 10, 营业额: 1294
部门ID: 2, 年份: 2023, 月份: 11, 营业额: 1469
部门ID: 2, 年份: 2023, 月份: 12, 营业额: 1042
部门ID: 2, 年份: 2024, 月份: 1, 营业额: 1242
部门ID: 2, 年份: 2024, 月份: 2, 营业额: 1389
部门ID: 2, 年份: 2024, 月份: 3, 营业额: 1472
部门ID: 2, 年份: 2024, 月份: 4, 营业额: 1347
部门ID: 2, 年份: 2024, 月份: 5, 营业额: 1257
部门ID: 2, 年份: 2024, 月份: 6, 营业额: 1279
部门ID: 2, 年份: 2024, 月份: 7, 营业额: 1535
部门ID: 2, 年份: 2024, 月份: 8, 营业额: 1231
部门ID: 2, 年份: 2024, 月份: 9, 营业额: 1476
部门ID: 2, 年份: 2024, 月份: 10, 营业额: 1419
部门ID: 2, 年份: 2024, 月份: 11, 营业额: 1663
部门ID: 2, 年份: 2024, 月份: 12, 营业额: 1188
部门ID: 3, 年份: 2023, 月份: 1, 营业额: 1631
部门ID: 3, 年份: 2023, 月份: 2, 营业额: 1629
部门ID: 3, 年份: 2023, 月份: 3, 营业额: 1856
部门ID: 3, 年份: 2023, 月份: 4, 营业额: 1623
部门ID: 3, 年份: 2023, 月份: 5, 营业额: 1798
部门ID: 3, 年份: 2023, 月份: 6, 营业额: 1485
部门ID: 3, 年份: 2023, 月份: 7, 营业额: 1748
部门ID: 3, 年份: 2023, 月份: 8, 营业额: 1702
部门ID: 3, 年份: 2023, 月份: 9, 营业额: 1550
部门ID: 3, 年份: 2023, 月份: 10, 营业额: 1848
部门ID: 3, 年份: 2023, 月份: 11, 营业额: 2194
部门ID: 3, 年份: 2023, 月份: 12, 营业额: 1681
部门ID: 3, 年份: 2024, 月份: 1, 营业额: 1734
部门ID: 3, 年份: 2024, 月份: 2, 营业额: 2362
部门ID: 3, 年份: 2024, 月份: 3, 营业额: 2392
部门ID: 3, 年份: 2024, 月份: 4, 营业额: 2137
部门ID: 3, 年份: 2024, 月份: 5, 营业额: 2127
部门ID: 3, 年份: 2024, 月份: 6, 营业额: 2192
部门ID: 3, 年份: 2024, 月份: 7, 营业额: 2041
部门ID: 3, 年份: 2024, 月份: 8, 营业额: 2042
部门ID: 3, 年份: 2024, 月份: 9, 营业额: 2171
部门ID: 3, 年份: 2024, 月份: 10, 营业额: 2344
部门ID: 3, 年份: 2024, 月份: 11, 营业额: 2499
部门ID: 3, 年份: 2024, 月份: 12, 营业额: 2193
数据库创建完成,已插入示例数据
DDL: CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL,
        manager_name TEXT NOT NULL,
        location TEXT NOT NULL
    )
Adding ddl: CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL,
        manager_name TEXT NOT NULL,
        location TEXT NOT NULL
    )
DDL: CREATE TABLE revenue (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        dept_id INTEGER,
        year INTEGER NOT NULL,
        month INTEGER NOT NULL,
        revenue_amount DECIMAL(12,2) NOT NULL,
        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
    )
Adding ddl: CREATE TABLE revenue (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        dept_id INTEGER,
        year INTEGER NOT NULL,
        month INTEGER NOT NULL,
        revenue_amount DECIMAL(12,2) NOT NULL,
        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
    )
DDL: CREATE TABLE sqlite_sequence(name,seq)
Adding ddl: CREATE TABLE sqlite_sequence(name,seq)
Adding documentation....
Adding documentation....
Adding documentation....

=== 问题1:比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图 ===
SQL Prompt: [{'role': 'system', 'content': "You are a SQLite expert. Please help to generate a SQL query to answer the question. Your response should ONLY be based on the given context and follow the response guidelines and format instructions. \n===Tables \nCREATE TABLE revenue (\n        id INTEGER PRIMARY KEY AUTOINCREMENT,\n        dept_id INTEGER,\n        year INTEGER NOT NULL,\n        month INTEGER NOT NULL,\n        revenue_amount DECIMAL(12,2) NOT NULL,\n        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)\n    )\n\nCREATE TABLE departments (\n        dept_id INTEGER PRIMARY KEY,\n        dept_name TEXT NOT NULL,\n        manager_name TEXT NOT NULL,\n        location TEXT NOT NULL\n    )\n\nCREATE TABLE sqlite_sequence(name,seq)\n\n\n===Additional Context \n\n数据包含2023年和2024年每个部门每个月的营业额数据\n\n示例数据库包含部门信息表 departments,字段有 dept_id, dept_name, manager_name, location\n\n示例数据库包含营业额表 revenue,字段有 id, dept_id, year, month, revenue_amount\n\n===Response Guidelines \n1. If the provided context is sufficient, please generate a valid SQL query without any explanations for the question. \n2. If the provided context is almost sufficient but requires knowledge of a specific string in a particular column, please generate an intermediate SQL query to find the distinct strings in that column. Prepend the query with a comment saying intermediate_sql \n3. If the provided context is insufficient, please explain why it can't be generated. \n4. Please use the most relevant table(s). \n5. If the question has been asked and answered before, please repeat the answer exactly as it was given before. \n6. Ensure that the output SQL is SQLite-compliant and executable, and free of syntax errors. \n"}, {'role': 'user', 'content': '比较各部门2023年和2024年营业额变化'}, {'role': 'assistant', 'content': '\nSELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) - SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as change_amount\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_id, d.dept_name\n'}, {'role': 'user', 'content': '查询2024年各部门总营业额'}, {'role': 'assistant', 'content': 'SELECT d.dept_name, SUM(r.revenue_amount) as total_revenue FROM departments d JOIN revenue r ON d.dept_id = r.dept_id WHERE r.year = 2024 GROUP BY d.dept_id, d.dept_name'}, {'role': 'user', 'content': '查询所有部门营业额明细'}, {'role': 'assistant', 'content': 'SELECT d.dept_name, r.year, r.month, r.revenue_amount FROM departments d JOIN revenue r ON d.dept_id = r.dept_id ORDER BY d.dept_name, r.year, r.month'}, {'role': 'user', 'content': '查询各年平均营业额'}, {'role': 'assistant', 'content': 'SELECT year, AVG(revenue_amount) as avg_revenue FROM revenue GROUP BY year'}, {'role': 'user', 'content': '计算各部门2024年月度增长率'}, {'role': 'assistant', 'content': '\nWITH monthly_revenue AS (\n    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total\n    FROM departments d \n    JOIN revenue r ON d.dept_id = r.dept_id \n    WHERE r.year = 2024\n    GROUP BY d.dept_name, r.year, r.month\n),\ngrowth_rates AS (\n    SELECT \n        dept_name,\n        month,\n        monthly_total,\n        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,\n        CASE \n            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 \n            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)\n            ELSE 0 \n        END as growth_rate\n    FROM monthly_revenue\n)\nSELECT dept_name, month, growth_rate FROM growth_rates WHERE growth_rate IS NOT NULL\n'}, {'role': 'user', 'content': '比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图'}]

Info: Ollama parameters:
model=qwen2.5:1.5b,
options={},
keep_alive=None

Info: Prompt Content:
[{"role": "system", "content": "You are a SQLite expert. Please help to generate a SQL query to answer the question. Your response should ONLY be based on the given context and follow the response guidelines and format instructions. \n===Tables \nCREATE TABLE revenue (\n        id INTEGER PRIMARY KEY AUTOINCREMENT,\n        dept_id INTEGER,\n        year INTEGER NOT NULL,\n        month INTEGER NOT NULL,\n        revenue_amount DECIMAL(12,2) NOT NULL,\n        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)\n    )\n\nCREATE TABLE departments (\n        dept_id INTEGER PRIMARY KEY,\n        dept_name TEXT NOT NULL,\n        manager_name TEXT NOT NULL,\n        location TEXT NOT NULL\n    )\n\nCREATE TABLE sqlite_sequence(name,seq)\n\n\n===Additional Context \n\n数据包含2023年和2024年每个部门每个月的营业额数据\n\n示例数据库包含部门信息表 departments,字段有 dept_id, dept_name, manager_name, location\n\n示例数据库包含营业额表 revenue,字段有 id, dept_id, year, month, revenue_amount\n\n===Response Guidelines \n1. If the provided context is sufficient, please generate a valid SQL query without any explanations for the question. \n2. If the provided context is almost sufficient but requires knowledge of a specific string in a particular column, please generate an intermediate SQL query to find the distinct strings in that column. Prepend the query with a comment saying intermediate_sql \n3. If the provided context is insufficient, please explain why it can't be generated. \n4. Please use the most relevant table(s). \n5. If the question has been asked and answered before, please repeat the answer exactly as it was given before. \n6. Ensure that the output SQL is SQLite-compliant and executable, and free of syntax errors. \n"}, {"role": "user", "content": "比较各部门2023年和2024年营业额变化"}, {"role": "assistant", "content": "\nSELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) - SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as change_amount\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_id, d.dept_name\n"}, {"role": "user", "content": "查询2024年各部门总营业额"}, {"role": "assistant", "content": "SELECT d.dept_name, SUM(r.revenue_amount) as total_revenue FROM departments d JOIN revenue r ON d.dept_id = r.dept_id WHERE r.year = 2024 GROUP BY d.dept_id, d.dept_name"}, {"role": "user", "content": "查询所有部门营业额明细"}, {"role": "assistant", "content": "SELECT d.dept_name, r.year, r.month, r.revenue_amount FROM departments d JOIN revenue r ON d.dept_id = r.dept_id ORDER BY d.dept_name, r.year, r.month"}, {"role": "user", "content": "查询各年平均营业额"}, {"role": "assistant", "content": "SELECT year, AVG(revenue_amount) as avg_revenue FROM revenue GROUP BY year"}, {"role": "user", "content": "计算各部门2024年月度增长率"}, {"role": "assistant", "content": "\nWITH monthly_revenue AS (\n    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total\n    FROM departments d \n    JOIN revenue r ON d.dept_id = r.dept_id \n    WHERE r.year = 2024\n    GROUP BY d.dept_name, r.year, r.month\n),\ngrowth_rates AS (\n    SELECT \n        dept_name,\n        month,\n        monthly_total,\n        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,\n        CASE \n            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 \n            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)\n            ELSE 0 \n        END as growth_rate\n    FROM monthly_revenue\n)\nSELECT dept_name, month, growth_rate FROM growth_rates WHERE growth_rate IS NOT NULL\n"}, {"role": "user", "content": "比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图"}]

Info: Ollama Response:
model='qwen2.5:1.5b' created_at='2025-07-12T04:30:42.351542212Z' done=True done_reason='stop' total_duration=12367885158 load_duration=189836712 prompt_eval_count=1012 prompt_eval_duration=1987914376 eval_count=116 eval_duration=10049151550 message=Message(role='assistant', content='```sql\nSELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_name;\n```', thinking=None, images=None, tool_calls=None)

LLM Response: ``~sql
SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_name;
``~
Info: Output from LLM: ``sql
SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_name;
``

Extracted SQL: SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_name
SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_name
  dept_name  revenue_2023  revenue_2024
0       市场部         13893         16498
1       技术部         20745         26234
2       销售部         18249         16105

SQL: SELECT 
    d.dept_name,
    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,
    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024
FROM departments d 
JOIN revenue r ON d.dept_id = r.dept_id 
WHERE r.year IN (2023, 2024)
GROUP BY d.dept_name

查询结果:
  dept_name  revenue_2023  revenue_2024
0       市场部         13893         16498
1       技术部         20745         26234
2       销售部         18249         16105

Info: Ollama parameters:
model=qwen2.5:1.5b,
options={},
keep_alive=None

Info: Prompt Content:
[{"role": "system", "content": "The following is a pandas DataFrame that contains the results of the query that answers the question the user asked: '比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图'\n\nThe DataFrame was produced using this query: SELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_name\n\nThe following is information about the resulting pandas DataFrame 'df': \nNone"}, {"role": "user", "content": "Can you generate the Python plotly code to chart the results of the dataframe? Assume the data is in a pandas dataframe called 'df'. If there is only one value in the dataframe, use an Indicator. Respond with only Python code. Do not answer with any explanations -- just the code."}]

Info: Ollama Response:
model='qwen2.5:1.5b' created_at='2025-07-12T04:30:46.930188474Z' done=True done_reason='stop' total_duration=3952451065 load_duration=202256087 prompt_eval_count=271 prompt_eval_duration=19133267 eval_count=38 eval_duration=3725519979 message=Message(role='assistant', content='``python\nimport plotly.express as px\n\nfig = px.bar(df, x="dept_name", y=df.columns.values, barmode="group")\nfig.show()\n``', thinking=None, images=None, tool_calls=None)

生成的图表代码:
import plotly.express as px

fig = px.bar(df, x="dept_name", y=df.columns.values, barmode="group")

图表已保存到 department_decline_chart.html

==================================================

=== 问题2:各个部门今年的月度营业额平均增长率 ===
SQL Prompt: [{'role': 'system', 'content': "You are a SQLite expert. Please help to generate a SQL query to answer the question. Your response should ONLY be based on the given context and follow the response guidelines and format instructions. \n===Tables \nCREATE TABLE sqlite_sequence(name,seq)\n\nCREATE TABLE revenue (\n        id INTEGER PRIMARY KEY AUTOINCREMENT,\n        dept_id INTEGER,\n        year INTEGER NOT NULL,\n        month INTEGER NOT NULL,\n        revenue_amount DECIMAL(12,2) NOT NULL,\n        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)\n    )\n\nCREATE TABLE departments (\n        dept_id INTEGER PRIMARY KEY,\n        dept_name TEXT NOT NULL,\n        manager_name TEXT NOT NULL,\n        location TEXT NOT NULL\n    )\n\n\n===Additional Context \n\n数据包含2023年和2024年每个部门每个月的营业额数据\n\n示例数据库包含营业额表 revenue,字段有 id, dept_id, year, month, revenue_amount\n\n示例数据库包含部门信息表 departments,字段有 dept_id, dept_name, manager_name, location\n\n===Response Guidelines \n1. If the provided context is sufficient, please generate a valid SQL query without any explanations for the question. \n2. If the provided context is almost sufficient but requires knowledge of a specific string in a particular column, please generate an intermediate SQL query to find the distinct strings in that column. Prepend the query with a comment saying intermediate_sql \n3. If the provided context is insufficient, please explain why it can't be generated. \n4. Please use the most relevant table(s). \n5. If the question has been asked and answered before, please repeat the answer exactly as it was given before. \n6. Ensure that the output SQL is SQLite-compliant and executable, and free of syntax errors. \n"}, {'role': 'user', 'content': '比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图'}, {'role': 'assistant', 'content': 'SELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_name'}, {'role': 'user', 'content': '比较各部门2023年和2024年营业额变化'}, {'role': 'assistant', 'content': '\nSELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) - SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as change_amount\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_id, d.dept_name\n'}, {'role': 'user', 'content': '查询2024年各部门总营业额'}, {'role': 'assistant', 'content': 'SELECT d.dept_name, SUM(r.revenue_amount) as total_revenue FROM departments d JOIN revenue r ON d.dept_id = r.dept_id WHERE r.year = 2024 GROUP BY d.dept_id, d.dept_name'}, {'role': 'user', 'content': '查询所有部门营业额明细'}, {'role': 'assistant', 'content': 'SELECT d.dept_name, r.year, r.month, r.revenue_amount FROM departments d JOIN revenue r ON d.dept_id = r.dept_id ORDER BY d.dept_name, r.year, r.month'}, {'role': 'user', 'content': '查询各年平均营业额'}, {'role': 'assistant', 'content': 'SELECT year, AVG(revenue_amount) as avg_revenue FROM revenue GROUP BY year'}, {'role': 'user', 'content': '计算各部门2024年月度增长率'}, {'role': 'assistant', 'content': '\nWITH monthly_revenue AS (\n    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total\n    FROM departments d \n    JOIN revenue r ON d.dept_id = r.dept_id \n    WHERE r.year = 2024\n    GROUP BY d.dept_name, r.year, r.month\n),\ngrowth_rates AS (\n    SELECT \n        dept_name,\n        month,\n        monthly_total,\n        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,\n        CASE \n            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 \n            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)\n            ELSE 0 \n        END as growth_rate\n    FROM monthly_revenue\n)\nSELECT dept_name, month, growth_rate FROM growth_rates WHERE growth_rate IS NOT NULL\n'}, {'role': 'user', 'content': '计算各个部门2024年每月营业额相比上月的增长率,用曲线图,横轴月份,纵轴增长率'}]

Info: Ollama parameters:
model=qwen2.5:1.5b,
options={},
keep_alive=None

Info: Prompt Content:
[{"role": "system", "content": "You are a SQLite expert. Please help to generate a SQL query to answer the question. Your response should ONLY be based on the given context and follow the response guidelines and format instructions. \n===Tables \nCREATE TABLE sqlite_sequence(name,seq)\n\nCREATE TABLE revenue (\n        id INTEGER PRIMARY KEY AUTOINCREMENT,\n        dept_id INTEGER,\n        year INTEGER NOT NULL,\n        month INTEGER NOT NULL,\n        revenue_amount DECIMAL(12,2) NOT NULL,\n        FOREIGN KEY (dept_id) REFERENCES departments (dept_id)\n    )\n\nCREATE TABLE departments (\n        dept_id INTEGER PRIMARY KEY,\n        dept_name TEXT NOT NULL,\n        manager_name TEXT NOT NULL,\n        location TEXT NOT NULL\n    )\n\n\n===Additional Context \n\n数据包含2023年和2024年每个部门每个月的营业额数据\n\n示例数据库包含营业额表 revenue,字段有 id, dept_id, year, month, revenue_amount\n\n示例数据库包含部门信息表 departments,字段有 dept_id, dept_name, manager_name, location\n\n===Response Guidelines \n1. If the provided context is sufficient, please generate a valid SQL query without any explanations for the question. \n2. If the provided context is almost sufficient but requires knowledge of a specific string in a particular column, please generate an intermediate SQL query to find the distinct strings in that column. Prepend the query with a comment saying intermediate_sql \n3. If the provided context is insufficient, please explain why it can't be generated. \n4. Please use the most relevant table(s). \n5. If the question has been asked and answered before, please repeat the answer exactly as it was given before. \n6. Ensure that the output SQL is SQLite-compliant and executable, and free of syntax errors. \n"}, {"role": "user", "content": "比较各部门2023年和2024年的总营业额,横轴部门,纵轴2023年和2024年营业额横向并排显示,用柱状图"}, {"role": "assistant", "content": "SELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_name"}, {"role": "user", "content": "比较各部门2023年和2024年营业额变化"}, {"role": "assistant", "content": "\nSELECT \n    d.dept_name,\n    SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as revenue_2023,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) as revenue_2024,\n    SUM(CASE WHEN r.year = 2024 THEN r.revenue_amount ELSE 0 END) - SUM(CASE WHEN r.year = 2023 THEN r.revenue_amount ELSE 0 END) as change_amount\nFROM departments d \nJOIN revenue r ON d.dept_id = r.dept_id \nWHERE r.year IN (2023, 2024)\nGROUP BY d.dept_id, d.dept_name\n"}, {"role": "user", "content": "查询2024年各部门总营业额"}, {"role": "assistant", "content": "SELECT d.dept_name, SUM(r.revenue_amount) as total_revenue FROM departments d JOIN revenue r ON d.dept_id = r.dept_id WHERE r.year = 2024 GROUP BY d.dept_id, d.dept_name"}, {"role": "user", "content": "查询所有部门营业额明细"}, {"role": "assistant", "content": "SELECT d.dept_name, r.year, r.month, r.revenue_amount FROM departments d JOIN revenue r ON d.dept_id = r.dept_id ORDER BY d.dept_name, r.year, r.month"}, {"role": "user", "content": "查询各年平均营业额"}, {"role": "assistant", "content": "SELECT year, AVG(revenue_amount) as avg_revenue FROM revenue GROUP BY year"}, {"role": "user", "content": "计算各部门2024年月度增长率"}, {"role": "assistant", "content": "\nWITH monthly_revenue AS (\n    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total\n    FROM departments d \n    JOIN revenue r ON d.dept_id = r.dept_id \n    WHERE r.year = 2024\n    GROUP BY d.dept_name, r.year, r.month\n),\ngrowth_rates AS (\n    SELECT \n        dept_name,\n        month,\n        monthly_total,\n        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,\n        CASE \n            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 \n            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)\n            ELSE 0 \n        END as growth_rate\n    FROM monthly_revenue\n)\nSELECT dept_name, month, growth_rate FROM growth_rates WHERE growth_rate IS NOT NULL\n"}, {"role": "user", "content": "计算各个部门2024年每月营业额相比上月的增长率,用曲线图,横轴月份,纵轴增长率"}]
Info: Ollama Response:
model='qwen2.5:1.5b' created_at='2025-07-12T04:31:18.470419351Z' done=True done_reason='stop' total_duration=29570517853 load_duration=186983514 prompt_eval_count=1159 prompt_eval_duration=3257042504 eval_count=321 eval_duration=25916463959 message=Message(role='assistant', content='WITH monthly_revenue AS (\n    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total\n    FROM departments d \n    JOIN revenue r ON d.dept_id = r.dept_id \n    WHERE r.year = 2024\n    GROUP BY d.dept_name, r.year, r.month\n),\ngrowth_rates AS (\n    SELECT \n        dept_name,\n        month,\n        monthly_total,\n        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,\n        CASE \n            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 \n            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)\n            ELSE 0 \n        END as growth_rate\n    FROM monthly_revenue\n),\ngrowth_rates_plot AS (\nSELECT \n    dept_name,\n    month,\n    growth_rate,\n    month - CAST(row_number() over(partition by dept_name order by month) as integer) + 1 as index_num\nFROM growth_rates\n)\nSELECT d.dept_name, r.month, g.growth_rate, g.index_num FROM departments d JOIN growth_rates_plot g ON d.dept_id = g.dept_name AND g.month = (SELECT MAX(month) FROM monthly_revenue WHERE dept_name = d.dept_id) ORDER BY g.index_num', thinking=None, images=None, tool_calls=None)
LLM Response: WITH monthly_revenue AS (
    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total
    FROM departments d 
    JOIN revenue r ON d.dept_id = r.dept_id 
    WHERE r.year = 2024
    GROUP BY d.dept_name, r.year, r.month
),
growth_rates AS (
    SELECT 
        dept_name,
        month,
        monthly_total,
        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,
        CASE 
            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 
            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)
            ELSE 0 
        END as growth_rate
    FROM monthly_revenue
),
growth_rates_plot AS (
SELECT 
    dept_name,
    month,
    growth_rate,
    month - CAST(row_number() over(partition by dept_name order by month) as integer) + 1 as index_num
FROM growth_rates
)
SELECT d.dept_name, r.month, g.growth_rate, g.index_num FROM departments d JOIN growth_rates_plot g ON d.dept_id = g.dept_name AND g.month = (SELECT MAX(month) FROM monthly_revenue WHERE dept_name = d.dept_id) ORDER BY g.index_num
WITH monthly_revenue AS (
    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total
    FROM departments d 
    JOIN revenue r ON d.dept_id = r.dept_id 
    WHERE r.year = 2024
    GROUP BY d.dept_name, r.year, r.month
),
growth_rates AS (
    SELECT 
        dept_name,
        month,
        monthly_total,
        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,
        CASE 
            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 
            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)
            ELSE 0 
        END as growth_rate
    FROM monthly_revenue
),
growth_rates_plot AS (
SELECT 
    dept_name,
    month,
    growth_rate,
    month - CAST(row_number() over(partition by dept_name order by month) as integer) + 1 as index_num
FROM growth_rates
)
SELECT d.dept_name, r.month, g.growth_rate, g.index_num FROM departments d JOIN growth_rates_plot g ON d.dept_id = g.dept_name AND g.month = (SELECT MAX(month) FROM monthly_revenue WHERE dept_name = d.dept_id) ORDER BY g.index_num
Couldn't run sql:  Execution failed on sql 'WITH monthly_revenue AS (
    SELECT d.dept_name, r.year, r.month, SUM(r.revenue_amount) as monthly_total
    FROM departments d 
    JOIN revenue r ON d.dept_id = r.dept_id 
    WHERE r.year = 2024
    GROUP BY d.dept_name, r.year, r.month
),
growth_rates AS (
    SELECT 
        dept_name,
        month,
        monthly_total,
        LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) as prev_month_total,
        CASE 
            WHEN LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month) > 0 
            THEN (monthly_total - LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)) * 100.0 / LAG(monthly_total) OVER (PARTITION BY dept_name ORDER BY month)
            ELSE 0 
        END as growth_rate
    FROM monthly_revenue
),
growth_rates_plot AS (
SELECT 
    dept_name,
    month,
    growth_rate,
    month - CAST(row_number() over(partition by dept_name order by month) as integer) + 1 as index_num
FROM growth_rates
)
SELECT d.dept_name, r.month, g.growth_rate, g.index_num FROM departments d JOIN growth_rates_plot g ON d.dept_id = g.dept_name AND g.month = (SELECT MAX(month) FROM monthly_revenue WHERE dept_name = d.dept_id) ORDER BY g.index_num': no such column: r.month

4.  结语

通过这段脚本,我们体验了 Vanna AI 从 RAG 训练 → LLM 生成 SQL → 自动可视化 的完整闭环。

  • 数据分析同学再也不用手敲长 SQL,业务同学也能一句话出报表;实现 ChatBI 聊天式商业智能
  • 真正做到端侧私有化部署,All‑in‑One。

遇到生成错误时,可以打印 SQL → 落地执行 → 回写训练集,即可把命中率拉到 90 % 以上。把这套思路迁移到自己公司的数据仓库,解锁更多自然语言 BI 场景!

参考:https://vanna.ai/docs/postgres-openai-standard-chromadb/

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2025-07-12,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 Michael阿明 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 文章目录
  • 1. 解决的核心问题是什么?
    • 1. 核心问题与解决方案
  • 2. 实现原理
    • 1. RAG 架构:向量检索 + 生成模型
    • 2. 执行与自我学习
    • 3. 可视化输出(可选)
    • 4. 技术优势详析
    • 5. 总结
  • 3. 实战案例
    • 3.1 清理目录 + 构造 SQLite 示例库
    • 3.2 Vanna 入口类:MyVanna
    • 3.3 训练三件套
    • 3.4 ask():自然语言 → SQL → DataFrame [→ Plotly]
  • 4.  结语
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档