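This library provides a CLI and a Python API for parsing llms.txt files and creating XML context files from them. The input file should follow this format: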
```markdown
# FastHTML

> FastHTML is a python library which...

When writing FastHTML apps remember to:

- Thing to remember

## Docs

- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior
- [FastHTML quick start](https://host/quickstart.html.md): An overview of FastHTML features

## Examples

- [Todo app](https://host/adv_app.py)

## Optional

- [Starlette docs](https://host/starlette-sml.md): A subset of the Starlette docs
```
Installation

```sh
pip install llms-txt
```
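How to use

CLI

After installation, `llms_txt2ctx` is available in your terminal.

To get help for the CLI:

```sh
llms_txt2ctx -h
```

To convert an `llms.txt` file to XML context and save it to `llms.md`:

```sh
llms_txt2ctx llms.txt > llms.md
```

Pass `--optional True` to include the "Optional" section of the input file.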
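
When the conversion has to run from a script rather than a shell, the same command can be driven through Python's `subprocess` module. The following is a minimal sketch, assuming `llms_txt2ctx` is on the `PATH`; the helper names `build_cmd` and `convert` are hypothetical and not part of the library.

```python
import subprocess
from pathlib import Path

def build_cmd(src, optional=False):
    "Build the argv list for the `llms_txt2ctx` CLI described above."
    cmd = ["llms_txt2ctx", str(src)]
    # `--optional True` is the flag shown above for including the Optional section
    if optional: cmd += ["--optional", "True"]
    return cmd

def convert(src, dest, optional=False):
    "Run the CLI and write its stdout to `dest` (the equivalent of `> dest` in a shell)."
    res = subprocess.run(build_cmd(src, optional), capture_output=True, text=True, check=True)
    Path(dest).write_text(res.stdout)

# convert("llms.txt", "llms.md")  # requires llms-txt to be installed
```

Keeping the argument list separate from the call makes the command easy to inspect or log before running it.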
Python module

```python
from pathlib import Path  # used below to read the sample file
from llms_txt import *
```

```python
samp = Path('llms-sample.txt').read_text()
```
Use `parse_llms_file` to create a data structure with the sections of an llms.txt file (you can also add `optional=True` if needed):
```python
parsed = parse_llms_file(samp)
list(parsed)
```

```
['title', 'summary', 'info', 'sections']
```

```python
parsed.title,parsed.summary
```

```
('FastHTML',
 'FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore\'s `FT` "FastTags" into a library for creating server-rendered hypermedia applications.')
```

```python
list(parsed.sections)
```

```
['Docs', 'Examples', 'Optional']
```

```python
parsed.sections.Optional[0]
```

```
{ 'desc': 'A subset of the Starlette documentation useful for FastHTML '
          'development.',
  'title': 'Starlette full documentation',
  'url': 'https://gist.githubusercontent.com/jph00/809e4a4808d4510be0e3dc9565e9cbd3/raw/9b717589ca44cedc8aaf00b2b8cacef922964c0f/starlette-sml.md'}
```
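
The parsed result can be walked like any nested mapping. The sketch below uses a hand-written plain `dict` that mirrors the shape of the output above, with the placeholder URLs from the sample format, so it runs without the library installed. `iter_links` is a hypothetical helper, and skipping `Optional` by default is an assumption modelled on the CLI flag.

```python
# Stand-in for the parsed structure: same keys as shown above, placeholder values
parsed = {
    "title": "FastHTML",
    "summary": "FastHTML is a python library which...",
    "info": "When writing FastHTML apps remember to:\n- Thing to remember",
    "sections": {
        "Docs": [{"title": "Surreal", "url": "https://host/README.md",
                  "desc": "Tiny jQuery alternative with Locality of Behavior"}],
        "Examples": [{"title": "Todo app", "url": "https://host/adv_app.py", "desc": None}],
        "Optional": [{"title": "Starlette docs", "url": "https://host/starlette-sml.md",
                      "desc": "A subset of the Starlette docs"}],
    },
}

def iter_links(parsed, optional=False):
    "Yield `(section, title, url)` for each link, skipping `Optional` unless requested."
    for name, links in parsed["sections"].items():
        if name == "Optional" and not optional: continue
        for link in links: yield name, link["title"], link["url"]

for sec, title, url in iter_links(parsed):
    print(f"{sec}: {title} -> {url}")
```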
Use `create_ctx` to create an LLM context file with XML sections, suitable for systems such as Claude (this is what the CLI calls behind the scenes).
```python
ctx = create_ctx(samp)
```

```python
print(ctx[:300])
```

```
<project title="FastHTML" summary='FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore's `FT` "FastTags" into a library for creating server-rendered hypermedia applications.'>
Remember:

- Use `serve()` for running uvicorn (`if __name__ == "__main__"` is not
```
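
To illustrate the general idea of wrapping parsed sections in XML, here is a rough sketch that is not the library's implementation. The function `to_ctx`, the per-section tag names, and the `<doc>` element layout are all assumptions made for illustration, and the sketch does not fetch the linked documents.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def to_ctx(title, summary, info, sections):
    "Render a `<project>` wrapper with one child element per section (hypothetical layout)."
    # quoteattr picks a safe quoting style and escapes quotes inside the value
    parts = [f"<project title={quoteattr(title)} summary={quoteattr(summary)}>", escape(info)]
    for name, links in sections.items():
        tag = name.lower()  # assumption: section names are simple words like "Docs"
        parts.append(f"<{tag}>")
        for l in links:
            parts.append(f"<doc title={quoteattr(l['title'])} url={quoteattr(l['url'])}/>")
        parts.append(f"</{tag}>")
    parts.append("</project>")
    return "\n".join(parts)

ctx = to_ctx("FastHTML", "fastcore's `FT` \"FastTags\"", "Remember:",
             {"Docs": [{"title": "Surreal", "url": "https://host/README.md"}]})
print(ctx)

# Round-trip through a parser to confirm the result is well-formed XML
root = ET.fromstring(ctx)
```

Escaping attribute values matters here because the sample summary contains both single and double quotes.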
Implementation and tests

To show how simple it is to parse `llms.txt` files, here is a complete parser in under 20 lines of code with no dependencies:
```python
from pathlib import Path
import re,itertools

def chunked(it, chunk_sz):
    it = iter(it)
    return iter(lambda: list(itertools.islice(it, chunk_sz)), [])

def parse_llms_txt(txt):
    "Parse llms.txt file contents in `txt` to a `dict`"
    def _p(links):
        # Matches `- [title](url)` with an optional `: description` suffix
        link_pat = r'-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^\)]+)\)(?::\s*(?P<desc>.*))?'
        return [re.search(link_pat, l).groupdict()
                for l in re.split(r'\n+', links.strip()) if l.strip()]

    # Split on `## Heading` lines; `rest` alternates heading, body, heading, body, ...
    start,*rest = re.split(r'^##\s*(.*?$)', txt, flags=re.MULTILINE)
    sects = {k: _p(v) for k,v in dict(chunked(rest, 2)).items()}
    pat = r'^#\s*(?P<title>.+?$)\n+(?:^>\s*(?P<summary>.+?$)$)?\n+(?P<info>.*)'
    d = re.search(pat, start.strip(), (re.MULTILINE|re.DOTALL)).groupdict()
    d['sections'] = sects
    return d
```
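
The two regular-expression steps in the parser above can be tried in isolation. This is a small sketch on made-up input lines (the URLs `u1` and `u2` and the title `T` are placeholders): first the link pattern on a line with and without a description, then the heading split that produces the alternating list the parser pairs up.

```python
import re

# Same link pattern as in the parser above, written as a raw string
link_pat = r'-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^\)]+)\)(?::\s*(?P<desc>.*))?'

with_desc = "- [Surreal](https://host/README.md): Tiny jQuery alternative with Locality of Behavior"
no_desc = "- [Todo app](https://host/adv_app.py)"

m1 = re.search(link_pat, with_desc).groupdict()
m2 = re.search(link_pat, no_desc).groupdict()
print(m1)
print(m2)  # `desc` is None because the optional group did not match

# re.split with a capturing group keeps the captured headings in the result,
# so headings and bodies alternate after the first element
txt = "# T\n\n## Docs\n- [a](u1)\n\n## Examples\n- [b](u2)\n"
start, *rest = re.split(r'^##\s*(.*?$)', txt, flags=re.MULTILINE)
print(rest)
```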
We provide a test suite in `tests/test-parse.py` and have confirmed that this implementation passes all tests.