(2024-12-01) Trying LLM for my Local Notes and Ebooks

  • I may also try to revive my CoachBot stuff, using a longer memory just for kicks.

What do I want to include?

Why? What's the target outcome/benefit/UI?

  • conversational UI to get answers based on good/smart references (and ideally, pointing back to the original sources)
  • inspiration to read at least parts of books I haven't touched

Probably taking a RAG approach, not training my own model like (2023-02-09) Willison Training Nanogpt Entirely On Content From My Blog.

Start out trying the approach at (2024-06-24) Adding ChatGPT-Like functionality to MacOS Spotlight Search, but decide to wait/read some more... Spotlight giving me a whole book doesn't help; it's too big to pass along as context.

Will probably install Simon Willison's LLM library first...
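Roughly what I'm aiming for, if the pieces come together; a sketch using llm's Python API, where the model ids, file path, and chunk-by-paragraph scheme are all my assumptions, not a working pipeline:

    # sketch of the target RAG loop: embed paragraph chunks, find the ones nearest
    # the question, and hand them to a local model as context
    import llm

    def chunk_paragraphs(path):
        """Split a plain-text ebook/note into paragraph-sized chunks."""
        text = open(path, encoding="utf-8").read()
        return [p.strip() for p in text.split("\n\n") if p.strip()]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

    chunks = chunk_paragraphs("books/some-book.txt")          # hypothetical path
    embedder = llm.get_embedding_model("mxbai-embed-large")   # use whatever id llm embed-models lists
    vectors = [embedder.embed(c) for c in chunks]

    question = "What does the author say about note-taking?"
    qvec = embedder.embed(question)
    top = sorted(zip(chunks, vectors), key=lambda cv: cosine(qvec, cv[1]), reverse=True)[:3]

    context = "\n\n".join(c for c, _ in top)
    model = llm.get_model("llama3.2")                         # the local Ollama model
    print(model.prompt(f"Answer using only this context:\n{context}\n\nQuestion: {question}").text())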

Dec04

  • make /py3/genai/ directory
  • create .venv inside; activate
  • python -m pip install llm
  • I don't have any paid accounts anymore: what's most cost-effective for this use?
    • I don't have a sense of how text maps to tokens. If I chunk an ebook by the paragraph, is it 1 token/paragraph? (No: a token is roughly 3/4 of an English word, so a typical paragraph is 100+ tokens; see the rough counting sketch after this list.)
  • my MacBookPro is from 2019, has Intel i7 (w 16GB), am I going to be able to use anything local?
    • Simon says: "I expect that should run the 8B or 3B models OK, I recommend trying Ollama"
  • this "selector" recommends:
    • Llama-3.2-90B-Vision
    • Llama-3.1-70B
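To get a feel for the token question above, a rough counting sketch; it uses tiktoken's cl100k_base encoding purely as an approximation (local Llama models use their own tokenizer), and the file path is made up:

    # ballpark token counts per paragraph: a paragraph is many tokens, not one
    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    text = open("books/some-book.txt", encoding="utf-8").read()   # hypothetical path
    paragraphs = [p for p in text.split("\n\n") if p.strip()]

    counts = [len(enc.encode(p)) for p in paragraphs]
    print(f"{len(paragraphs)} paragraphs, ~{sum(counts)} tokens total, "
          f"~{sum(counts) // max(len(counts), 1)} tokens per paragraph")
    # rule of thumb: a token is ~3/4 of an English word, so a 100-word paragraph is ~130 tokens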

Dec07

  • go to Ollama site, download/install Mac app, launch, run ollama run llama3.2 in terminal, do a tiny chat, it works.
  • install Simon's plugin: llm install llm-ollama
  • the llama3.2 I installed is the 3B model, so I'll stick with that.
  • llm -m llama3.2 'How much is 2+2?' works
  • moving to embedding instructions at (2023-09-04) Willison Llm Now Provides Tools For Working With Embeddings
  • llm install llm-sentence-transformers
  • pick an embedding model: mxbai-embed-large, because it has 10x the downloads of any of the others (a sketch of what using it should look like is below, after the error output)
  • llm sentence-transformers register mxbai-embed-large ->
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
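  • for reference, a sketch of what the embedding step should give me once the NumPy mess is sorted, calling sentence-transformers directly; the mixedbread-ai/mxbai-embed-large-v1 Hugging Face id is my assumption for mxbai-embed-large:

    # direct sentence-transformers usage, bypassing the llm plugin, as a sanity check
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")  # downloads weights on first use
    sentences = ["A paragraph from one of my ebooks.", "A note from my wiki."]
    embeddings = model.encode(sentences)          # one vector per input sentence
    print(embeddings.shape)
    print(util.cos_sim(embeddings, embeddings))   # pairwise cosine similarities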
  • pip install -U numpy
    Requirement already satisfied: numpy in ./.venv/lib/python3.11/site-packages (2.1.3)
  • something led me to https://newreleases.io/project/pypi/sentence-transformers/release/3.1.1: "This patch release fixes hard negatives mining for models that don't automatically normalize their embeddings and it lifts the numpy<2 restriction that was previously required." OK.
  • pip install sentence-transformers[train]==3.1.1
    zsh: no matches found: sentence-transformers[train]==3.1.1
  • llm install sentence-transformers[train]==3.1.1 -> same thing
  • hmm, I see sentence-transformers is at 3.3.1 -> llm install sentence-transformers[train]==3.3.1
    zsh: no matches found: sentence-transformers[train]==3.3.1
    • (the zsh "no matches found" is just zsh trying to glob the unquoted square brackets; quoting the argument, e.g. pip install "sentence-transformers[train]==3.3.1", gets past that)
  • ok, how do I figure out which module is the issue? (one way to narrow it down is sketched below, after the install output)
  • Simon says do llm install llm-python ->
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/Users/billseitz/Documents/code/py3/genai/.venv/bin/llm", line 5, in <module>
    from llm.cli import cli
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm/cli.py", line 1887, in <module>
    load_plugins()
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm/plugins.py", line 25, in load_plugins
    pm.load_setuptools_entrypoints("llm")
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 421, in load_setuptools_entrypoints
    plugin = ep.load()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/llm_sentence_transformers.py", line 2, in <module>
    from sentence_transformers import SentenceTransformer
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/__init__.py", line 9, in <module>
    from sentence_transformers.backend import (
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/backend.py", line 11, in <module>
    from sentence_transformers.util import disable_datasets_caching, is_datasets_available
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/sentence_transformers/util.py", line 17, in <module>
    import torch
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/__init__.py", line 1477, in <module>
    from .functional import *  # noqa: F403
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/functional.py", line 9, in <module>
    import torch.nn.functional as F
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/Users/billseitz/Documents/code/py3/genai/.venv/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Collecting llm-python
  Downloading llm_python-0.1-py3-none-any.whl.metadata (3.3 kB)
Requirement already satisfied: llm in ./.venv/lib/python3.11/site-packages (from llm-python) (0.19)
[... Requirement already satisfied lines for llm's dependencies (click, openai, click-default-group, sqlite-utils, sqlite-migrate, pydantic, PyYAML, pluggy, python-ulid, setuptools, pip, puremagic, and their own dependencies); versions match the pip freeze below ...]
Downloading llm_python-0.1-py3-none-any.whl (7.2 kB)
Installing collected packages: llm-python
Successfully installed llm-python-0.1
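  • one way to narrow down which compiled package is still built against NumPy 1.x: import the likely suspects one at a time in fresh subprocesses and check stderr for the warning (a sketch; the suspect list is just a guess):

    # import each compiled package in a clean interpreter and see which ones
    # print the "compiled using NumPy 1.x" warning to stderr
    import subprocess, sys

    suspects = ["torch", "scipy", "sklearn", "transformers", "sentence_transformers"]
    for mod in suspects:
        result = subprocess.run([sys.executable, "-c", f"import {mod}"],
                                capture_output=True, text=True)
        flagged = "compiled using NumPy 1.x" in result.stderr
        print(f"{mod:22s} exit={result.returncode} numpy1-build={'YES' if flagged else 'no'}")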
  • llm python -m pip freeze ->
[same NumPy 1.x / 2.1.3 warning and llm_sentence_transformers -> sentence_transformers -> torch traceback as above, then the freeze:]
annotated-types==0.7.0
anyio==4.6.2.post1
certifi==2024.8.30
charset-normalizer==3.4.0
click==8.1.7
click-default-group==1.2.4
distro==1.9.0
filelock==3.16.1
fsspec==2024.10.0
h11==0.14.0
httpcore==1.0.7
httpx==0.27.2
huggingface-hub==0.26.5
idna==3.10
Jinja2==3.1.4
jiter==0.8.0
joblib==1.4.2
llm==0.19
llm-ollama==0.7.1
llm-python==0.1
llm-sentence-transformers==0.2
MarkupSafe==3.0.2
mpmath==1.3.0
networkx==3.4.2
numpy==2.1.3
ollama==0.4.3
openai==1.56.2
packaging==24.2
pillow==11.0.0
pluggy==1.5.0
puremagic==1.28
pydantic==2.10.3
pydantic_core==2.27.1
python-dateutil==2.9.0.post0
python-ulid==3.0.0
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
safetensors==0.4.5
scikit-learn==1.5.2
scipy==1.14.1
sentence-transformers==3.3.1
six==1.17.0
sniffio==1.3.1
sqlite-fts4==1.0.3
sqlite-migrate==0.1b0
sqlite-utils==3.38
sympy==1.13.3
tabulate==0.9.0
threadpoolctl==3.5.0
tokenizers==0.21.0
torch==2.2.2
tqdm==4.67.1
transformers==4.47.0
typing_extensions==4.12.2
urllib3==2.2.3

Dec08

  • (from above I see I'm using numpy==2.1.3)
  • it feels like the issue above is with PyTorch
  • even python3 -c "import torch; print(torch.__version__)" fails with the NumPy alert
  • pip show torch -> Name: torch; Version: 2.2.2
  • hmm, PyTorch is at 2.5.1 now. If I upgrade PyTorch, will something else break?
  • I see some people suggesting pip install --force-reinstall -v "numpy==1.25.2", but that was months ago
  • I try upgrading torch, but can't get it to install anything beyond 2.2. Oh dang, PyTorch dropped Intel Mac (x86_64 macOS) builds after 2.2...
  • what will break if I downgrade numpy to <2?
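  • my hedged guess at the answer: since this torch 2.2.2 wheel was built against NumPy 1.x, pinning NumPy below 2 with pip install "numpy<2" (quoted so zsh doesn't treat the < as a redirect) should be the least-risky fix; as far as I can tell nothing in the freeze above requires numpy>=2, and sentence-transformers 3.x accepts either major version. A quick sanity check afterwards:

    # run after: pip install "numpy<2"   (should land on numpy 1.26.x)
    import numpy, torch
    from sentence_transformers import SentenceTransformer  # just confirming the import chain works now

    print("numpy", numpy.__version__)   # expecting 1.26.x
    print("torch", torch.__version__)   # still 2.2.2, the last build with Intel Mac wheels
    print(torch.tensor([2.0, 2.0]).sum().item())   # 4.0, without the _ARRAY_API warning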
