(2024-12-01) Trying LLM for my Local Notes and Ebooks
- I may also try to revive my CoachBot stuff, using a longer memory just for kicks.
What do I want to include?
- many of my ebooks, read and un-read
- both my digital gardens
- liked/not-yet-posted Instapaper articles
- unread Instapaper saves?
Why? What's the target outcome/benefit/UI?
- conversational UI to get answers based on good/smart references (and, ideally, pointing to the origins)
- inspiration to read at least parts of books I haven't touched
Probably taking a RAG approach, not the train-your-own-model route of (2023-02-09) Willison Training Nanogpt Entirely On Content From My Blog.
Start out trying the approach at (2024-06-24) Adding ChatGPT-Like functionality to MacOS Spotlight Search, then decide to wait/read some more: Spotlight giving me a whole book doesn't help, since it's too big to pass along as context.
- see the bottom of (2023-09-04) Willison Llm Now Provides Tools For Working With Embeddings for thoughts on chunking long documents (rough chunking sketch below).
- But Spotlight can still be the search engine I use.
Will probably install Simon Willison's LLM library first...
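To make the chunking idea concrete, a minimal sketch of paragraph-level chunking for an ebook already converted to plain text (the file path, the 1500-char cap, and the merge-short-paragraphs heuristic are placeholder assumptions, not anything from Simon's post):
# split an ebook (plain-text export) into paragraph-ish chunks before embedding
from pathlib import Path

def chunk_paragraphs(path, max_chars=1500):
    text = Path(path).read_text(encoding="utf-8")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) > max_chars:
            chunks.append(current)  # current chunk is full, start a new one
            current = p
        else:
            current = (current + "\n\n" + p) if current else p
    if current:
        chunks.append(current)
    return chunks

# e.g. chunks = chunk_paragraphs("some-ebook.txt")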
Dec04
- make /py3/genai/ directory, create a .venv inside, activate it, then python -m pip install llm (exact commands sketched at the end of this entry)
- I don't have any paid accounts anymore: what's most cost-effective for this use?
- I don't have a sense of how text maps to tokens, or what that will cost. If I chunk an ebook by the paragraph, is it 1 token/paragraph? (see the token-count sketch at the end of this entry)
- my MacBook Pro is from 2019, with an Intel i7 (and 16GB RAM): am I going to be able to run anything local?
- Simon says: "I expect that should run the 8B or 3B models OK, I recommend trying Ollama"
- this "selector" recommends:
- Llama-3.2-90B-Vision
- Llama-3.1-70B
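- for the record, the setup from the first bullet boils down to something like this (standard venv commands; the activate line assumes zsh/bash):
cd ~/Documents/code/py3/genai
python3 -m venv .venv
source .venv/bin/activate
python -m pip install llm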
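- to get a feel for the token question: a paragraph is many tokens, not one; a common rule of thumb is roughly 4 characters of English per token. A quick way to check, using OpenAI's tiktoken tokenizer as a rough gauge (other models tokenize differently, and tiktoken isn't part of the setup above):
# pip install tiktoken first; this is just for a rough count
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by many OpenAI models
paragraph = "A typical paragraph of a few sentences runs to dozens of tokens, not one."
print(len(enc.encode(paragraph)))  # token count for this paragraph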
Dec07
- go to Ollama site, download/install the Mac app, launch it, then in a terminal run
ollama run llama3.2
and do a tiny chat: it works.
- install Simon's plugin:
llm install llm-ollama
- the llama3.2 I installed is the 3B model, so I'll stick with that.
llm -m llama3.2 'How much is 2+2?'
works.
- moving to the embedding instructions at (2023-09-04) Willison Llm Now Provides Tools For Working With Embeddings:
llm install llm-sentence-transformers
- pick an embedding model: mxbai-embed-large, because it has 10x the downloads of any of the others (usage sketch at the end of this entry); register it:
llm sentence-transformers register mxbai-embed-large
->
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
pip install -U numpy
Requirement already satisfied: numpy in ./.venv/lib/python3.11/site-packages (2.1.3)
- something led me to https://newreleases.io/project/pypi/sentence-transformers/release/3.1.1 : "This patch release fixes hard negatives mining for models that don't automatically normalize their embeddings and it lifts the numpy<2 restriction that was previously required." OK.
pip install sentence-transformers[train]==3.1.1
zsh: no matches found: sentence-transformers[train]==3.1.1
llm install sentence-transformers[train]==3.1.1
-> same thing
- hmm, I see sentence-transformers is at 3.3.1 ->
llm install sentence-transformers[train]==3.3.1
zsh: no matches found: sentence-transformers[train]==3.3.1
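- (aside: the "zsh: no matches found" error is just zsh treating the square brackets as a glob pattern, not a pip/llm problem; quoting the requirement avoids that particular error, e.g.:)
llm install 'sentence-transformers[train]==3.3.1'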
- ok how do I figure out which module is the issue?
- Simon says do
llm install llm-python
->
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "/Users/billseitz/Documents/code/py3/genai/.venv/bin/llm", line 5, in <module>
from llm.cli import cli
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm/cli.py", line 1887, in <module>
load_plugins()
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm/plugins.py", line 25, in load_plugins
pm.load_setuptools_entrypoints("llm")
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 421, in load_setuptools_entrypoints
plugin = ep.load()
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
module = import_module(match.group('module'))
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm_sentence_transformers.py", line 2, in <module>
from sentence_transformers import SentenceTransformer
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/__init__.py", line 9, in <module>
from sentence_transformers.backend import (
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/backend.py", line 11, in <module>
from sentence_transformers.util import disable_datasets_caching, is_datasets_available
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/util.py", line 17, in <module>
import torch
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/__init__.py", line 1477, in <module>
from .functional import * # noqa: F403
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/functional.py", line 9, in <module>
import torch.nn.functional as F
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/__init__.py", line 1, in <module>
from .modules import * # noqa: F403
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
from .transformer import TransformerEncoder, TransformerDecoder, \
File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
Collecting llm-python
Downloading llm_python-0.1-py3-none-any.whl.metadata (3.3 kB)
Requirement already satisfied: llm in ./.venv/lib/python3.11/site-packages (from llm-python) (0.19)
Requirement already satisfied: click in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (8.1.7)
Requirement already satisfied: openai>=1.0 in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (1.56.2)
Requirement already satisfied: click-default-group>=1.2.3 in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (1.2.4)
Requirement already satisfied: sqlite-utils>=3.37 in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (3.38)
Requirement already satisfied: sqlite-migrate>=0.1a2 in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (0.1b0)
Requirement already satisfied: pydantic>=1.10.2 in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (2.10.3)
Requirement already satisfied: PyYAML in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (6.0.2)
Requirement already satisfied: pluggy in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (1.5.0)
Requirement already satisfied: python-ulid in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (3.0.0)
Requirement already satisfied: setuptools in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (65.5.0)
Requirement already satisfied: pip in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (24.0)
Requirement already satisfied: puremagic in ./.venv/lib/python3.11/site-packages (from llm->llm-python) (1.28)
Requirement already satisfied: anyio<5,>=3.5.0 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (4.6.2.post1)
Requirement already satisfied: distro<2,>=1.7.0 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (1.9.0)
Requirement already satisfied: httpx<1,>=0.23.0 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (0.27.2)
Requirement already satisfied: jiter<1,>=0.4.0 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (0.8.0)
Requirement already satisfied: sniffio in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (1.3.1)
Requirement already satisfied: tqdm>4 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (4.67.1)
Requirement already satisfied: typing-extensions<5,>=4.11 in ./.venv/lib/python3.11/site-packages (from openai>=1.0->llm->llm-python) (4.12.2)
Requirement already satisfied: annotated-types>=0.6.0 in ./.venv/lib/python3.11/site-packages (from pydantic>=1.10.2->llm->llm-python) (0.7.0)
Requirement already satisfied: pydantic-core==2.27.1 in ./.venv/lib/python3.11/site-packages (from pydantic>=1.10.2->llm->llm-python) (2.27.1)
Requirement already satisfied: sqlite-fts4 in ./.venv/lib/python3.11/site-packages (from sqlite-utils>=3.37->llm->llm-python) (1.0.3)
Requirement already satisfied: tabulate in ./.venv/lib/python3.11/site-packages (from sqlite-utils>=3.37->llm->llm-python) (0.9.0)
Requirement already satisfied: python-dateutil in ./.venv/lib/python3.11/site-packages (from sqlite-utils>=3.37->llm->llm-python) (2.9.0.post0)
Requirement already satisfied: idna>=2.8 in ./.venv/lib/python3.11/site-packages (from anyio<5,>=3.5.0->openai>=1.0->llm->llm-python) (3.10)
Requirement already satisfied: certifi in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->openai>=1.0->llm->llm-python) (2024.8.30)
Requirement already satisfied: httpcore==1.* in ./.venv/lib/python3.11/site-packages (from httpx<1,>=0.23.0->openai>=1.0->llm->llm-python) (1.0.7)
Requirement already satisfied: h11<0.15,>=0.13 in ./.venv/lib/python3.11/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai>=1.0->llm->llm-python) (0.14.0)
Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.11/site-packages (from python-dateutil->sqlite-utils>=3.37->llm->llm-python) (1.17.0)
Downloading llm_python-0.1-py3-none-any.whl (7.2 kB)
Installing collected packages: llm-python
Successfully installed llm-python-0.1
llm python -m pip freeze
->
(same NumPy warning and import traceback as above, then:)
annotated-types==0.7.0
anyio==4.6.2.post1
certifi==2024.8.30
charset-normalizer==3.4.0
click==8.1.7
click-default-group==1.2.4
distro==1.9.0
filelock==3.16.1
fsspec==2024.10.0
h11==0.14.0
httpcore==1.0.7
httpx==0.27.2
huggingface-hub==0.26.5
idna==3.10
Jinja2==3.1.4
jiter==0.8.0
joblib==1.4.2
llm==0.19
llm-ollama==0.7.1
llm-python==0.1
llm-sentence-transformers==0.2
MarkupSafe==3.0.2
mpmath==1.3.0
networkx==3.4.2
numpy==2.1.3
ollama==0.4.3
openai==1.56.2
packaging==24.2
pillow==11.0.0
pluggy==1.5.0
puremagic==1.28
pydantic==2.10.3
pydantic_core==2.27.1
python-dateutil==2.9.0.post0
python-ulid==3.0.0
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.4.5
scikit-learn==1.5.2
scipy==1.14.1
sentence-transformers==3.3.1
six==1.17.0
sniffio==1.3.1
sqlite-fts4==1.0.3
sqlite-migrate==0.1b0
sqlite-utils==3.38
sympy==1.13.3
tabulate==0.9.0
threadpoolctl==3.5.0
tokenizers==0.21.0
torch==2.2.2
tqdm==4.67.1
transformers==4.47.0
typing_extensions==4.12.2
urllib3==2.2.3
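- for reference, once the register step actually works, usage per Simon's embeddings post should look roughly like the lines below; the collection name, file glob, and db filename are made up, and I'm assuming the registered model id comes out as sentence-transformers/mxbai-embed-large:
llm embed -m sentence-transformers/mxbai-embed-large -c 'a test sentence'
llm embed-multi ebooks --files ~/ebooks-txt '*.txt' -m sentence-transformers/mxbai-embed-large -d ebooks.db --store
llm similar ebooks -c 'what does this book say about habits?' -d ebooks.db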
Dec08
- (from above I see I'm using numpy==2.1.3)
- it feels like the issue above is with PyTorch
- even
python3 -c "import torch; print(torch.__version__)"
fails with the NumPy alert
- pip show torch -> Name: torch; Version: 2.2.2
- hmm, PyTorch is at 2.5.1 now. If I upgrade PyTorch, will something else break?
- I see some people suggesting
pip install --force-reinstall -v "numpy==1.25.2"
but that was months ago.
- I try upgrading torch, but can't seem to get anything beyond 2.2 installed. Oh dang: PyTorch dropped Intel Mac support after that, so 2.2.2 is as far as I can go.
- what will break if I downgrade numpy to <2?
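- if I try it, something like this is probably the move (quoting the spec so zsh doesn't treat < as a redirect), followed by a quick check that torch imports cleanly; whether anything else in the freeze list actually needs numpy 2.x is the open question:
python -m pip install 'numpy<2'
python -c "import numpy, torch; print(numpy.__version__, torch.__version__)"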