Finance - findpapers: a paper search and download tool
findpapers search search.json --query "[Deep Learning] AND [Knowledge Graph] AND ([Quantitative Investment] OR [Algorithmic Trading] OR [Financial Analysis] OR [Risk Assessment] OR [Economic Cycle] OR [Business Cycle])" --databases "arxiv,ssrn,repec,econbiz,semanticscholar" --limit-db 40 --verbose
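This command uses the findpapers tool to search five databases (arxiv, ssrn, repec, econbiz, semanticscholar) for academic papers that satisfy a boolean keyword condition.
In general form:
findpapers search search.json --query "[...]" --databases "arxiv,ssrn,repec,econbiz,semanticscholar" --limit-db 40 --verbose
The command asks findpapers to retrieve, from the databases "arxiv,ssrn,repec,econbiz,semanticscholar", the academic papers that match the following keyword combination
"[Deep Learning] AND [Knowledge Graph] AND ([Quantitative Investment] OR [Algorithmic Trading] OR [Financial Analysis] OR [Risk Assessment] OR [Economic Cycle] OR [Business Cycle])"
and saves the results to the file search.json.
The parameters are as follows:
search.json: the output file that receives the search results.
--query: the search expression; each term is wrapped in square brackets and terms are combined with AND, OR and parentheses.
--databases: a comma-separated list of the databases to search.
--limit-db 40: collect at most 40 papers from each database.
--verbose: print detailed log output while the search runs.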
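Long bracketed queries like the one above are easy to mistype by hand. As a minimal sketch, the query string could be assembled from keyword groups in Python. The helper names bracket, any_of and all_of are illustrative assumptions and are not part of findpapers; the sketch only reproduces the text of the query.

```python
from typing import Sequence


def bracket(term: str) -> str:
    """Wrap one search term in square brackets, as the query syntax requires."""
    return f"[{term}]"


def any_of(terms: Sequence[str]) -> str:
    """Join terms with OR; add parentheses only when there are several terms."""
    joined = " OR ".join(bracket(t) for t in terms)
    return f"({joined})" if len(terms) > 1 else joined


def all_of(groups: Sequence[Sequence[str]]) -> str:
    """Join the OR-groups with AND to form the complete query string."""
    return " AND ".join(any_of(g) for g in groups)


# Hypothetical grouping that mirrors the query used in this article.
query = all_of([
    ["Deep Learning"],
    ["Knowledge Graph"],
    ["Quantitative Investment", "Algorithmic Trading", "Financial Analysis",
     "Risk Assessment", "Economic Cycle", "Business Cycle"],
])
print(query)
```

The printed string can then be pasted after --query, inside double quotes.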
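When the search finishes, the output file contains organized results like the following (this run produced a single candidate paper, found via arXiv):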
{
  "databases": [
    "arxiv",
    "ssrn",
    "repec",
    "econbiz",
    "semanticscholar"
  ],
  "limit": null,
  "limit_per_database": 40,
  "number_of_papers": 1,
  "number_of_papers_by_database": {
    "arXiv": 1
  },
  "papers": [
    {
      "abstract": "Knowledge Graphs have emerged as a compelling abstraction for capturing key\nrelationship among the entities of interest to enterprises and for integrating\ndata from heterogeneous sources. JPMorgan Chase (JPMC) is leading this trend by\nleveraging knowledge graphs across the organization for multiple mission\ncritical applications such as risk assessment, fraud detection, investment\nadvice, etc. A core problem in leveraging a knowledge graph is to link mentions\n(e.g., company names) that are encountered in textual sources to entities in\nthe knowledge graph. Although several techniques exist for entity linking, they\nare tuned for entities that exist in Wikipedia, and fail to generalize for the\nentities that are of interest to an enterprise. In this paper, we propose a\nnovel end-to-end neural entity linking model (JEL) that uses minimal context\ninformation and a margin loss to generate entity embeddings, and a Wide & Deep\nLearning model to match character and semantic information respectively. We\nshow that JEL achieves the state-of-the-art performance to link mentions of\ncompany names in financial news with entities in our knowledge graph. We report\non our efforts to deploy this model in the company-wide system to generate\nalerts in response to financial news. The methodology used for JEL is directly\napplicable and usable by other enterprises who need entity linking solutions\nfor data that are unique to their respective situations.",
      "authors": [
        "Wanying Ding",
        "Vinay K. Chaudhri",
        "Naren Chittar",
        "Krishna Konakanchi"
      ],
      "categories": {},
      "citations": null,
      "comments": "8 pages, 4 figures, IAAI-21",
      "databases": [
        "arXiv"
      ],
      "doi": "10.1609/aaai.v35i17.17796",
      "keywords": [],
      "number_of_pages": null,
      "pages": null,
      "publication": null,
      "publication_date": "2024-11-05",
      "selected": true,
      "title": "JEL: Applying End-to-End Neural Entity Linking in JPMorgan Chase",
      "urls": [
        "http://arxiv.org/abs/2411.02695v1",
        "http://arxiv.org/pdf/2411.02695v1",
        "http://dx.doi.org/10.1609/aaai.v35i17.17796"
      ]
    }
  ],
  "processed_at": "2025-10-08 07:39:04",
  "publication_types": null,
  "query": "[Deep Learning] AND [Knowledge Graph] AND ([Quantitative Investment] OR [Algorithmic Trading] OR [Financial Analysis] OR [Risk Assessment] OR [Economic Cycle] OR [Business Cycle])",
  "since": null,
  "until": null
}
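To check a result file like this one without reading the raw JSON, a short Python sketch may help. It relies only on the field names visible in the output above (papers, databases, title, query). The function names load_results and summarize and the inline sample are illustrative assumptions, not part of findpapers.

```python
import json
from collections import Counter


def load_results(path: str) -> dict:
    """Read a findpapers result file (plain JSON) into a dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize(results: dict) -> dict:
    """Count papers overall and per source database, and list the titles."""
    papers = results.get("papers", [])
    per_db = Counter(db for p in papers for db in p.get("databases", []))
    return {
        "query": results.get("query"),
        "total": len(papers),
        "per_database": dict(per_db),
        "titles": [p.get("title") for p in papers],
    }


# Small in-memory stand-in for the file; with a real run one would call
# summarize(load_results("search.json")) instead.
sample = {
    "query": "[Deep Learning] AND [Knowledge Graph]",
    "papers": [
        {
            "title": "JEL: Applying End-to-End Neural Entity Linking in JPMorgan Chase",
            "databases": ["arXiv"],
        }
    ],
}
summary = summarize(sample)
print(summary["total"], summary["per_database"])
```

A per-database count of this kind makes it easy to see which of the requested databases actually contributed papers.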
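The search returned only one paper, so the constraints need to be relaxed (covering machine learning as well as deep learning) and the databases narrowed to those that better fit quantitative finance and investment needs. The broadened command below also joins the method terms with OR instead of AND, adds [Finance] and [Investment] as topic terms, and restricts results to papers published since 2020-01-01: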
findpapers search search_broad.json --query "([Machine Learning] OR [Deep Learning] OR [Knowledge Graph]) AND ([Quantitative Investment] OR [Algorithmic Trading] OR [Financial Analysis] OR [Risk Assessment] OR [Finance] OR [Investment])" --databases "arxiv,semanticscholar" --limit-db 40 --since 2020-01-01 --verbose
After the search completes, run the following refinement step to preselect papers:
findpapers refine search_broad.json
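During refinement you decide, paper by paper, whether to keep each one.
When refinement is finished, run the following command to download the papers: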
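Before downloading, it can be useful to confirm how many papers were kept. The sketch below groups papers by the "selected" field that appears in the JSON output. It assumes that true means kept, false means rejected, and null means not yet reviewed; the last two interpretations are assumptions, since the output above only shows a true value. The function name split_by_selection is hypothetical.

```python
def split_by_selection(results: dict) -> dict:
    """Group paper titles by the value of their "selected" flag."""
    groups = {"kept": [], "rejected": [], "undecided": []}
    for paper in results.get("papers", []):
        flag = paper.get("selected")
        if flag is True:
            groups["kept"].append(paper.get("title"))
        elif flag is False:
            groups["rejected"].append(paper.get("title"))
        else:
            # Assumption: a missing or null flag means the paper was not reviewed.
            groups["undecided"].append(paper.get("title"))
    return groups


# Hypothetical refined results, used only to demonstrate the function.
refined = {
    "papers": [
        {"title": "Paper A", "selected": True},
        {"title": "Paper B", "selected": False},
        {"title": "Paper C", "selected": None},
    ]
}
groups = split_by_selection(refined)
print({name: len(titles) for name, titles in groups.items()})
```

With a real file, the dict would come from loading search_broad.json with Python's json module.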
findpapers download search_broad.json ./papers_broad --selected --verbose
After the command runs, the papers are downloaded one by one, although slowly: downloading 36 papers took about one hour.
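The figure above works out to roughly 100 seconds per paper (3600 s / 36). As a rough planning aid, the sketch below scales that rate to other batch sizes. It assumes the download time grows linearly with the number of papers, which is only an approximation; the function name estimate_download_minutes is hypothetical.

```python
def estimate_download_minutes(n_papers: int,
                              seconds_per_paper: float = 3600 / 36) -> float:
    """Estimate total download time in minutes, assuming a constant per-paper rate."""
    return n_papers * seconds_per_paper / 60


# The observed run: 36 papers at about 100 seconds each.
print(estimate_download_minutes(36))
```

Actual times will vary with network conditions and with how quickly each publisher's site responds.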