Hands-On Code | Official from ECMWF: One Python Library for GRIB, NetCDF, and Other Meteorological Data Formats

Author: 用户11172986
Published: 2026-06-08 14:44:43
Included in the column: 气python风雨

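We previously introduced earthkit-regrid, which offers several interpolation methods. Today we look at its sibling, earthkit-data, which is used to retrieve and read data.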

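earthkit-data is an open-source Python library developed under the lead of ECMWF (the European Centre for Medium-Range Weather Forecasts), focused on data access and processing for meteorology and climate science.

Its defining feature is that it is format-agnostic: a single API handles GRIB, NetCDF, BUFR, and other formats, without the user having to deal with low-level details.

Its core design principles include: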

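  • Unified entry point: all data sources are loaded through from_source()
  • Lazy loading: by default only the fields currently needed are kept in memory, which makes very large files manageable
  • Streaming support: sources such as URL and FDB can be read as a stream without first writing to disk
  • Caching: remote data can be cached locally automatically, and the cache is highly configurable
  • Plugin extensions: users can define their own data sources and publish them as plugins
  • GPU ready: the underlying design already takes GPU computing into account, with PyTorch tensor support planned for the future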
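
To make the lazy-loading idea concrete, here is a purely conceptual sketch in plain Python. The class `LazyFieldList` and its `loaded_count` attribute are hypothetical names invented for this illustration; earthkit-data's real implementation is different and far more capable. The point is only that the container stores how to read each field, and reads it when it is requested.

```python
from typing import Callable, Iterator, List


class LazyFieldList:
    """Toy stand-in for a lazily loaded field list (not earthkit-data's real class)."""

    def __init__(self, loaders: List[Callable[[], list]]):
        # Keep only cheap "how to read field i" callables, not the data itself.
        self._loaders = loaders
        self.loaded_count = 0  # number of field reads performed so far

    def __len__(self) -> int:
        # Knowing the length does not require reading any data.
        return len(self._loaders)

    def __getitem__(self, index: int) -> list:
        # Data is materialised only when a field is actually requested.
        self.loaded_count += 1
        return self._loaders[index]()

    def __iter__(self) -> Iterator[list]:
        for i in range(len(self)):
            yield self[i]


# Three hypothetical "fields", each produced on demand.
fields = LazyFieldList([lambda i=i: [float(i)] * 4 for i in range(3)])
first = fields[0]  # only this one field has been read so far
```

After running this, `len(fields)` is 3 while `fields.loaded_count` is 1, which is the behaviour that keeps memory use low for very large files.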
[Image: tutorial list from the official earthkit-data documentation]

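This is the tutorial list from the official site, which shows that most meteorological data formats are already supported.

The rest of this article only tests reading GFS data in GRIB format; please test other data types yourself.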

!pip install earthkit-data -i https://pypi.mirrors.ustc.edu.cn/simple/

Output:

Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Requirement already satisfied: earthkit-data in /opt/conda/lib/python3.11/site-packages (0.20.0)
Requirement already satisfied: cfgrib>=0.9.10.1 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (0.9.14.1)
Requirement already satisfied: dask in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (2024.8.1)
Requirement already satisfied: deprecation in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (2.1.0)
Requirement already satisfied: earthkit-utils<0.99,>=0.2 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (0.3.0)
Requirement already satisfied: eccodes>=1.7 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (2.47.0)
Requirement already satisfied: entrypoints in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (0.4)
Requirement already satisfied: filelock in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (3.29.1)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (3.1.4)
Requirement already satisfied: jsonschema in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (4.23.0)
Requirement already satisfied: lru-dict in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (1.4.1)
Requirement already satisfied: markdown in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (3.6)
Requirement already satisfied: multiurl>=0.3.3 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (0.3.7)
Requirement already satisfied: netcdf4 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (1.6.3)
Requirement already satisfied: pandas in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (2.2.3)
Requirement already satisfied: pdbufr>=0.11 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (0.14.2)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (6.0.2)
Requirement already satisfied: tqdm>=4.63 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (4.67.0)
Requirement already satisfied: xarray>=0.19 in /opt/conda/lib/python3.11/site-packages (from earthkit-data) (2024.3.0)
Requirement already satisfied: attrs>=19.2 in /opt/conda/lib/python3.11/site-packages (from cfgrib>=0.9.10.1->earthkit-data) (24.3.0)
Requirement already satisfied: click in /opt/conda/lib/python3.11/site-packages (from cfgrib>=0.9.10.1->earthkit-data) (8.1.7)
Requirement already satisfied: numpy in /opt/conda/lib/python3.11/site-packages (from cfgrib>=0.9.10.1->earthkit-data) (1.26.4)
Requirement already satisfied: array-api-compat in /opt/conda/lib/python3.11/site-packages (from earthkit-utils<0.99,>=0.2->earthkit-data) (1.14.0)
Requirement already satisfied: pint in /opt/conda/lib/python3.11/site-packages (from earthkit-utils<0.99,>=0.2->earthkit-data) (0.24.4)
Requirement already satisfied: cffi in /opt/conda/lib/python3.11/site-packages (from eccodes>=1.7->earthkit-data) (1.17.1)
Requirement already satisfied: findlibs in /opt/conda/lib/python3.11/site-packages (from eccodes>=1.7->earthkit-data) (0.0.5)
Requirement already satisfied: eccodeslib in /opt/conda/lib/python3.11/site-packages (from eccodes>=1.7->earthkit-data) (2.47.1.20)
Requirement already satisfied: requests in /opt/conda/lib/python3.11/site-packages (from multiurl>=0.3.3->earthkit-data) (2.32.3)
Requirement already satisfied: pytz in /opt/conda/lib/python3.11/site-packages (from multiurl>=0.3.3->earthkit-data) (2024.1)
Requirement already satisfied: python-dateutil in /opt/conda/lib/python3.11/site-packages (from multiurl>=0.3.3->earthkit-data) (2.9.0.post0)
Requirement already satisfied: packaging>=22 in /opt/conda/lib/python3.11/site-packages (from xarray>=0.19->earthkit-data) (24.1)
Requirement already satisfied: tzdata>=2022.7 in /opt/conda/lib/python3.11/site-packages (from pandas->earthkit-data) (2024.2)
Requirement already satisfied: cloudpickle>=3.0.0 in /opt/conda/lib/python3.11/site-packages (from dask->earthkit-data) (3.1.0)
Requirement already satisfied: fsspec>=2021.09.0 in /opt/conda/lib/python3.11/site-packages (from dask->earthkit-data) (2025.2.0)
Requirement already satisfied: partd>=1.4.0 in /opt/conda/lib/python3.11/site-packages (from dask->earthkit-data) (1.4.2)
Requirement already satisfied: toolz>=0.10.0 in /opt/conda/lib/python3.11/site-packages (from dask->earthkit-data) (1.0.0)
Requirement already satisfied: importlib-metadata>=4.13.0 in /opt/conda/lib/python3.11/site-packages (from dask->earthkit-data) (8.5.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.11/site-packages (from jinja2->earthkit-data) (3.0.2)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /opt/conda/lib/python3.11/site-packages (from jsonschema->earthkit-data) (2024.10.1)
Requirement already satisfied: referencing>=0.28.4 in /opt/conda/lib/python3.11/site-packages (from jsonschema->earthkit-data) (0.35.1)
Requirement already satisfied: rpds-py>=0.7.1 in /opt/conda/lib/python3.11/site-packages (from jsonschema->earthkit-data) (0.22.3)
Requirement already satisfied: cftime in /opt/conda/lib/python3.11/site-packages (from netcdf4->earthkit-data) (1.6.4)
Requirement already satisfied: zipp>=3.20 in /opt/conda/lib/python3.11/site-packages (from importlib-metadata>=4.13.0->dask->earthkit-data) (3.21.0)
Requirement already satisfied: locket in /opt/conda/lib/python3.11/site-packages (from partd>=1.4.0->dask->earthkit-data) (1.0.0)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.11/site-packages (from python-dateutil->multiurl>=0.3.3->earthkit-data) (1.17.0)
Requirement already satisfied: pycparser in /opt/conda/lib/python3.11/site-packages (from cffi->eccodes>=1.7->earthkit-data) (2.22)
Requirement already satisfied: eckitlib==2.0.7.20 in /opt/conda/lib/python3.11/site-packages (from eccodeslib->eccodes>=1.7->earthkit-data) (2.0.7.20)
Requirement already satisfied: platformdirs>=2.1.0 in /opt/conda/lib/python3.11/site-packages (from pint->earthkit-utils<0.99,>=0.2->earthkit-data) (4.3.6)
Requirement already satisfied: typing_extensions>=4.0.0 in /opt/conda/lib/python3.11/site-packages (from pint->earthkit-utils<0.99,>=0.2->earthkit-data) (4.12.2)
Requirement already satisfied: flexcache>=0.3 in /opt/conda/lib/python3.11/site-packages (from pint->earthkit-utils<0.99,>=0.2->earthkit-data) (0.3)
Requirement already satisfied: flexparser>=0.4 in /opt/conda/lib/python3.11/site-packages (from pint->earthkit-utils<0.99,>=0.2->earthkit-data) (0.4)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.11/site-packages (from requests->multiurl>=0.3.3->earthkit-data) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.11/site-packages (from requests->multiurl>=0.3.3->earthkit-data) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.11/site-packages (from requests->multiurl>=0.3.3->earthkit-data) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.11/site-packages (from requests->multiurl>=0.3.3->earthkit-data) (2024.12.14)

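1. Installation

Requirement: Python ≥ 3.10

The basic installation includes only the core functionality. Choose one of the following options according to your needs: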

# Basic installation
# !pip install earthkit-data

# Install all optional dependencies (recommended; supports all formats such as GRIB/NetCDF/BUFR/FDB)
# !pip install earthkit-data[all]

# Or install support for specific formats as needed
# !pip install earthkit-data[grib]      # GRIB format
# !pip install earthkit-data[netcdf]    # NetCDF format
# !pip install earthkit-data[bufr]      # BUFR format
# !pip install earthkit-data[fdb]       # FDB (Fields DataBase) support
# !pip install earthkit-data[polytope]  # Polytope API support
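
After installing, it can be useful to check which format backends are actually importable in the current environment. The following is a minimal sketch using only the standard library. The module names in the default list (`eccodes`, `netCDF4`, `pdbufr`, `xarray`) are an assumption based on the dependencies that appear in the pip log above; adjust them to your setup. The function name `available_backends` is made up for this example.

```python
from importlib.util import find_spec


def available_backends(modules=("eccodes", "netCDF4", "pdbufr", "xarray")):
    """Return {module_name: True/False} depending on whether it can be imported."""
    # find_spec() locates a top-level module without importing it.
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, ok in available_backends().items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```

A missing entry suggests that the corresponding optional dependency still needs to be installed before that format can be read.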

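2. The core entry point: from_source

earthkit.data.from_source(<type>, <args>) is the central entry point of the whole library. It loads data from all kinds of sources in a uniform way.

Commonly supported types include: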

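Type | Description | Example scenario
--- | --- | ---
sample | Built-in sample data | Quick trials and testing
file | Local file | Reading local GRIB/NetCDF files
url | Remote URL | Downloading online data, with automatic tar.gz extraction
stream | Data stream | Processing standard input or network streams
fdb | ECMWF FDB | Reading from the Fields DataBase
mars | MARS service | Remote access through ECMWF MARS
cds | CDS API | Retrieving data from the Copernicus Climate Data Store
polytope | Polytope | Accessing data through the Polytope service
dummy-source | Dummy data source | Generating fake data for testing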
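
As an illustration of how the type argument can be chosen programmatically, the sketch below picks between the "url" and "file" types from the table based on the location string. The helper names `guess_source_type` and `load` are hypothetical and not part of earthkit-data; only `from_source("file", ...)` and `from_source("url", ...)` come from the article.

```python
from urllib.parse import urlparse


def guess_source_type(location: str) -> str:
    """Return "url" for http(s) locations and "file" for everything else."""
    scheme = urlparse(location).scheme.lower()
    return "url" if scheme in {"http", "https"} else "file"


def load(location: str):
    """Load data through earthkit-data, choosing the source type from the location."""
    # Imported lazily so that guess_source_type() works without earthkit-data installed.
    import earthkit.data as ekd

    return ekd.from_source(guess_source_type(location), location)
```

With this in place, `load("/path/to/file.grb2")` and `load("https://.../archive.tar.gz")` would go through the same call, which is the kind of uniformity the unified entry point is meant to provide.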

import earthkit.data as ekd

ds = ekd.from_source("file", "/home/mw/input/GFS1824/gfs_4_20230902_0000_021.grb2")
print("Type:", type(ds))
print("Number of fields:", len(ds))

Output:

Type: <class 'earthkit.data.readers.grib.file.GRIBReader'>
Number of fields: 743

/opt/conda/lib/python3.11/site-packages/gribapi/__init__.py:23: UserWarning: ecCodes 2.42.0 or higher is recommended. You are running version 2.29.0
  warnings.warn(
# Load from a URL (tar.gz archives are extracted automatically)
url = "https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test_gribs.tar.gz"
ds_url = ekd.from_source("url", url)
print("Type after loading from URL:", type(ds_url))
print("Number of fields:", len(ds_url))

Output:

test_gribs.tar.gz:   0%|          | 0.00/463k [00:00<?, ?B/s]
  0%|          | 0/2 [00:00<?, ?it/s]
Type after loading from URL: <class 'earthkit.data.readers.grib.index.GribMultiFieldList'>
Number of fields: 6

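3. Inspecting metadata

In earthkit-data, GRIB files and suitably structured NetCDF files are represented as a FieldList. Each element is a Field: a horizontal slice holding one meteorological variable at one time and one level.

The following methods give a quick overview of the data: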

# Quick tabular summary
ds.ls()

[Image: table produced by ds.ls()]

# Detailed description grouped by param
ds.describe()

[Image: table produced by ds.describe()]
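
Because a FieldList is a sequence of Field objects, a summary can also be built by hand. The sketch below counts fields per parameter by iterating over the list and reading one metadata key from each field. The function name `count_fields_by_param` is invented for this example, and it assumes every field exposes the `param` key, as the GRIB fields in this article do.

```python
from collections import Counter


def count_fields_by_param(fieldlist) -> Counter:
    """Count how many fields each parameter contributes to a FieldList."""
    # Iterating a FieldList yields Field objects; metadata("param") reads one key.
    return Counter(field.metadata("param") for field in fieldlist)
```

Calling `count_fields_by_param(ds).most_common(5)` on the dataset loaded earlier would list the five parameters with the most fields.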

4. Filtering and slicing data

FieldList provides an xarray-like sel() method for filtering by metadata such as parameter name, level, or forecast step.

# Filter by parameter name (param is a commonly used GRIB key)
t2m = ds.sel(param="2t")
print("Number of fields after filtering:", len(t2m))
t2m.ls()

Output:

Number of fields after filtering: 1

[Image: table produced by t2m.ls()]

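# Filter on several conditions at once
subset = ds.sel(param="cin", levelist=9000)
print("Number of fields after multi-condition filtering:", len(subset))
subset.ls()

Output:

Number of fields after multi-condition filtering: 1

[Image: table produced by subset.ls()]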
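# Sort by metadata keys
ordered = ds.order_by("param", "levelist")
print("First 5 entries after sorting:")
ordered[:5].ls()  # slice first so that only five rows are listed

Output:

First 5 entries after sorting:

[Image: table of the sorted fields]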

# Slicing: take the first 3 fields
first3 = ds[:3]
print("Number of fields after slicing:", len(first3))
first3.ls()

Output:

Number of fields after slicing: 3

[Image: table produced by first3.ls()]
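
Filtering and sorting can be chained. The following sketch selects one parameter on several levels and returns the result ordered by level. It relies on sel() accepting a list of values for a key, which is how the earthkit-data documentation describes it; the helper name `select_levels` is made up here, and the `levelist` key is taken from the examples above.

```python
def select_levels(fieldlist, param, levels):
    """Return the fields of one parameter on several levels, sorted by level."""
    # A list value means "match any of these levels".
    subset = fieldlist.sel(param=param, levelist=list(levels))
    # order_by() returns a new FieldList sorted by the given metadata key.
    return subset.order_by("levelist")
```

For instance, `select_levels(ds, "t", [500, 850])` would be expected to return the temperature fields on those two levels, provided the file contains them under that parameter name.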

5. Accessing and working with a single Field

A single field can be accessed by index, to inspect its detailed metadata or export its data.

# Take the first field
f = ds[0]
print("Type:", type(f))
print("Parameter:", f.metadata("param"))
print("Level:", f.metadata("levelist"))
print("Forecast step:", f.metadata("step"))

Output:

Type: <class 'earthkit.data.readers.grib.codes.GribField'>
Parameter: prmsl
Level: 0
Forecast step: 21
# Dump the field's metadata for one namespace (mars and default are commonly used)
f.dump(namespace="mars")
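
Besides metadata, a Field also gives access to its values. The sketch below combines a few metadata keys with basic statistics of the data array obtained through to_numpy(). The function name `summarize_field` is hypothetical, and the choice of keys (`param`, `step`) simply mirrors the example above.

```python
import numpy as np


def summarize_field(field) -> dict:
    """Collect a few metadata keys and basic statistics for one Field."""
    values = np.asarray(field.to_numpy(), dtype=float)
    return {
        "param": field.metadata("param"),
        "step": field.metadata("step"),
        "shape": values.shape,
        # nanmin/nanmax ignore missing values encoded as NaN
        "min": float(np.nanmin(values)),
        "max": float(np.nanmax(values)),
    }
```

Calling `summarize_field(ds[0])` on the first field would give a compact overview of the prmsl field without printing the whole array.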

6. Data conversion

earthkit-data provides several common export interfaces, so the data can be passed on directly to ecosystem tools such as xarray, pandas, and numpy.

# Convert to an xarray Dataset (the most common choice for meteorological analysis)
xa = subset.to_xarray()
print(xa)

Output:

<xarray.Dataset> Size: 2MB
Dimensions:    (latitude: 361, longitude: 720)
Coordinates:
  * latitude   (latitude) float64 3kB 90.0 89.5 89.0 88.5 ... -89.0 -89.5 -90.0
  * longitude  (longitude) float64 6kB 0.0 0.5 1.0 1.5 ... 358.5 359.0 359.5
Data variables:
    cin        (latitude, longitude) float64 2MB ...
Attributes:
    param:        cin
    paramId:      228001
    levtype:      unknown
    date:         20230902
    time:         0
    levelist:     9000
    Conventions:  CF-1.8
    institution:  ECMWF
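
The "2MB" reported for the cin variable follows directly from the grid dimensions in the output above. As a quick check of the arithmetic, assuming 8 bytes per float64 value:

```python
n_lat = 361  # 90 to -90 in 0.5 degree steps, both poles included
n_lon = 720  # 0 to 359.5 in 0.5 degree steps
bytes_per_value = 8  # float64

n_points = n_lat * n_lon
size_bytes = n_points * bytes_per_value
print(n_points, size_bytes)  # 259920 2079360
```

So a single global 0.5 degree field holds 259,920 grid points and takes roughly 2 MB once loaded as float64, which helps when estimating the memory needed for many fields.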
xa['cin'].plot()

Output:

<matplotlib.collections.QuadMesh at 0x7f2bd921c290>

[Image: map plot of the cin field]

Converting multi-variable data still has some issues; hopefully future releases will improve this.
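
One possible way around this, offered only as an untested sketch and not as a confirmed fix for the problem mentioned above, is to convert each parameter separately and keep the results in a dictionary. The helper name `to_xarray_per_param` is invented for this example. Note that it may still fail when one parameter appears on several level types, in which case an additional sel() condition would be needed.

```python
def to_xarray_per_param(fieldlist, params):
    """Convert each parameter separately and return {param: xarray.Dataset}."""
    datasets = {}
    for param in params:
        subset = fieldlist.sel(param=param)
        if len(subset) == 0:
            continue  # parameter not present in this file
        datasets[param] = subset.to_xarray()
    return datasets
```

For example, `to_xarray_per_param(ds, ["2t", "prmsl"])` would return one Dataset per parameter, which can then be handled individually in xarray.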

7. A brief introduction to the earthkit ecosystem

earthkit-data is only one member of the larger earthkit family. The full earthkit ecosystem also includes:

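Package | Purpose
--- | ---
earthkit-data | Data access and format handling (the topic of this notebook)
earthkit-maps | Map projections and geographic information handling
earthkit-plots | Visualization of meteorological data
earthkit-regrid | Grid interpolation and resampling
earthkit-transforms | Data transformation and statistical computation
earthkit-climate | Climatological analysis and computation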

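Together, these packages form a complete toolchain for meteorological workflows, from data retrieval through processing and analysis to visualization.

8. Summary and caveats

Overall, this tool is fairly industrial-strength and is well suited to end-to-end projects or to processing very large volumes of data.

Key points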

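  • from_source() is the single entry point for loading and hides the differences between underlying formats
  • FieldList / Field are the core abstractions, providing a uniform interface for metadata access and data export
  • sel() / order_by() / slicing make data filtering very intuitive
  • to_xarray() / to_numpy() / to_pandas() connect the data to the existing scientific computing ecosystem
  • Lazy loading by default, combined with streaming reads, means you can process datasets far larger than memory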

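⚠️ Version note

At the time of writing, earthkit-data is still at the Release Candidate stage and no stable 1.0 release has been published. The API may change before the final release, so for production use it is advisable to follow the official Migration Guide.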
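
Since the API may still change, it helps to record which version a notebook was run with. A minimal sketch using only the standard library follows; the function name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "earthkit-data"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("earthkit-data version:", installed_version())
```

In the environment used for this article, the pip log above reports version 0.20.0. Pinning that version in a requirements file is one way to keep results reproducible until a stable release is available.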

References

  • GitHub: https://github.com/ecmwf/earthkit-data
  • Documentation: https://earthkit-data.readthedocs.io/en/latest/
  • earthkit overview: https://earthkit.readthedocs.io/en/latest/
  • ECMWF official introduction: https://www.ecmwf.int/en/newsletter/179/computing/introducing-earthkit
Originally published on 2026-06-06. This article is shared from the WeChat official account 气python风雨.
