Reading data from BigQuery with user account credentials
In [1]: import pandas as pd
To run a query in BigQuery you need your own BigQuery project. With one in place, we can query some public sample data:
In [2]: data = pd.read_gbq('''SELECT title, id, num_characters
   ...:                       FROM [publicdata:samples.wikipedia]
   ...:                       LIMIT 5''',
   ...:                    project_id='<your-project-id>')
This will print out:
Your browser has been opened to visit:
https://accounts.google.com/o/oauth2/v2/auth...[long URL truncated]
If your browser is on a different machine then exit and re-run this
application with the command-line parameter
--noauth_local_webserver
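If you are working on a local machine, a browser window will pop up. After you grant the requested permissions, pandas continues with the output: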
Authentication successful.
Requesting query... ok.
Query running...
Query done.
Processed: 13.8 Gb
Retrieving results...
Got 5 rows.
Total time taken 1.5 s.
Finished at 2016-08-23 11:26:03.
Result:
In [3]: data
Out[3]:
               title       id  num_characters
0       Fusidic acid   935328            1112
1     Clark Air Base   426241            8257
2  Watergate scandal    52382           25790
3               2005    35984           75813
4               .BLP  2664340            1659
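The call above can be wrapped in a small helper so that the query text and the project ID are kept in one place. The following is only a sketch under a few assumptions: the function name `read_wikipedia_sample` and its `reader` parameter are hypothetical, the `pandas-gbq` package is assumed to be installed, and `dialect="legacy"` is passed because the bracketed table name `[publicdata:samples.wikipedia]` is legacy SQL syntax, which newer releases may no longer treat as the default.

```python
def read_wikipedia_sample(project_id, limit=5, reader=None):
    """Run the sample query from this example and return the result.

    `reader` is a hypothetical hook for illustration: any callable with the
    same calling convention as read_gbq. It defaults to pandas_gbq.read_gbq.
    """
    if reader is None:
        # Assumption: the pandas-gbq package is installed.
        import pandas_gbq
        reader = pandas_gbq.read_gbq

    # Same query as in the session above; int() guards the LIMIT value.
    query = (
        "SELECT title, id, num_characters "
        "FROM [publicdata:samples.wikipedia] "
        "LIMIT {:d}".format(int(limit))
    )
    # Assumption: the installed version accepts the `dialect` keyword.
    return reader(query, project_id=project_id, dialect="legacy")
```

Calling `read_wikipedia_sample('<your-project-id>')` should then go through the same authentication flow as shown above on the first run.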
As a side effect, pandas will create a JSON file named bigquery_credentials.dat, which lets you run further queries without granting permissions again:
In [9]: pd.read_gbq('SELECT count(1) cnt FROM [publicdata:samples.wikipedia]',
   ...:             project_id='<your-project-id>')
Requesting query... ok.
[rest of output truncated]
Out[9]:
cnt
0 313797035
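Because the cached credentials decide whether the browser prompt appears, it can be handy to check for the file, or to delete it to force a fresh authentication. The sketch below assumes that bigquery_credentials.dat is written to the current working directory; the location may differ between versions, and the helper names are hypothetical.

```python
import os

# File name taken from the text above; its location is an assumption.
CREDENTIALS_FILE = "bigquery_credentials.dat"


def has_cached_credentials(directory="."):
    """Return True if the credentials file exists in `directory`."""
    return os.path.isfile(os.path.join(directory, CREDENTIALS_FILE))


def forget_cached_credentials(directory="."):
    """Delete the credentials file if present; return True if it was removed."""
    path = os.path.join(directory, CREDENTIALS_FILE)
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False
```

If `has_cached_credentials()` returns False, the next query would be expected to open the browser again, as in the first run above.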