(2024-06-24) Adding ChatGPT-Like functionality to MacOS Spotlight Search

Pranav Dhoolia on Adding ChatGPT-Like functionality to MacOS Spotlight Search. In one of my YouTube videos, I hosted LLMs locally using Ollama and interacted with them through a ChatGPT-like interface using Ollama Web UI, also hosted locally. This post continues that journey. Here, I build a solution using a locally hosted LLM and LangChain (a widely used framework for building LLM solutions).

...technique where we first retrieve information from a database or corpus and then pass it on to an LLM to generate a response is also referred to as Retrieval Augmented Generation (RAG). Since we’ll do everything locally, let’s call this Local RAG.
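
The retrieve-then-generate flow can be sketched in a few lines. This is only an illustration of the idea, not the code from the post: local_rag, retrieve, and generate are hypothetical names, and both steps are passed in as plain functions so that any retriever (such as Spotlight) and any locally hosted LLM can be plugged in.

```python
from typing import Callable, List


def local_rag(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve passages for the question, then ask the LLM to answer from them."""
    # Step 1: retrieval, e.g. a Spotlight search over local files.
    passages = retrieve(question)
    # Step 2: augmentation, i.e. put the retrieved text into the prompt.
    context = "\n\n".join(passages)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    # Step 3: generation, e.g. a call to a locally hosted LLM.
    return generate(prompt)
```

Keeping the two steps as separate callables makes it easy to test the prompt assembly without an LLM running.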

Let’s begin by creating a Python environment... Let’s now install langchain... For this exercise I want to search through PDF documents, so let’s install pypdf... Let’s first write a Python program to use Spotlight programmatically. Let’s create a shell (skeleton) program which receives the query as an input... Let’s now write a function that uses Spotlight to search for files matching this query. We’ll use the mdfind command and run it as a subprocess. We’ll then call this from our main function. Let’s run it now: python spotlight.py "search_with_spotlight using mdfind"... We should see the program file in the results.
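
A minimal sketch of what spotlight.py could look like is shown here. The names search_with_spotlight and mdfind come from the text above; the helper functions, the optional only_in parameter (mapped to mdfind's -onlyin flag), and the usage message are assumptions added for illustration, so the original program may differ.

```python
import subprocess
import sys
from typing import List, Optional


def build_mdfind_command(query: str, only_in: Optional[str] = None) -> List[str]:
    """Build the mdfind argument list; -onlyin limits the search to one directory."""
    cmd = ["mdfind"]
    if only_in:
        cmd += ["-onlyin", only_in]
    cmd.append(query)
    return cmd


def parse_mdfind_output(stdout: str) -> List[str]:
    """mdfind prints one matching path per line; drop any blank lines."""
    return [line for line in stdout.splitlines() if line.strip()]


def search_with_spotlight(query: str, only_in: Optional[str] = None) -> List[str]:
    """Run mdfind as a subprocess and return the matching file paths (macOS only)."""
    result = subprocess.run(
        build_mdfind_command(query, only_in),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_mdfind_output(result.stdout)


def main(argv: List[str]) -> int:
    """Expect exactly one argument, the query, and print each matching path."""
    if len(argv) != 2:
        print(f"usage: python {argv[0]} <query>", file=sys.stderr)
        return 1
    for path in search_with_spotlight(argv[1]):
        print(path)
    return 0
```

To run it from the command line as described above, the file would end with a call such as sys.exit(main(sys.argv)) under the usual __main__ guard. Passing the command as a list, rather than a single shell string, avoids quoting problems when the query contains spaces.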

While Spotlight gets us the files of interest containing certain keywords, it doesn’t precisely answer our questions... Based on keywords like resume, coursera, and generative AI, Spotlight may get us the resume files of interest, but it will not give us a short summary including contact details for the candidates. Let’s now use LLMs to do this for us. Let’s begin by writing our driver function... Let’s write the get_docs function to extract text documents from PDF files. Let’s now build our QA chain. We’ll use the LangChain Expression Language (LCEL) syntax for this. Let’s start by defining the Prompt, the LLM model, and the OutputParser... Let’s now build the data to be injected into the prompt. We’ll use the docs, question, and doc_type expected while invoking the chain.
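
The get_docs function and the QA chain could be sketched roughly as follows. Only the names get_docs, docs, question, and doc_type come from the text; everything else is an assumption for illustration: the prompt wording, the helpers read_pdf_text, format_docs, and build_qa_chain, the use of pypdf's PdfReader directly (the original may use a LangChain document loader instead), the ChatOllama class from langchain_community, and the model name "llama3".

```python
from operator import itemgetter
from typing import Callable, Dict, List

# Hypothetical prompt wording; the variables match the keys named in the text.
PROMPT_TEMPLATE = """You are given the text of some {doc_type} documents.

{docs}

Using only the documents above, answer this question: {question}"""


def read_pdf_text(path: str) -> str:
    """Extract the text of every page with pypdf (imported lazily)."""
    from pypdf import PdfReader

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def get_docs(
    paths: List[str], read_pdf: Callable[[str], str] = read_pdf_text
) -> List[Dict[str, str]]:
    """Turn the PDF files among `paths` into {"source", "text"} records."""
    docs = []
    for path in paths:
        if path.lower().endswith(".pdf"):  # skip non-PDF Spotlight hits
            docs.append({"source": path, "text": read_pdf(path)})
    return docs


def format_docs(docs: List[Dict[str, str]]) -> str:
    """Join the documents into one string, each labelled with its source path."""
    return "\n\n".join(f"[{d['source']}]\n{d['text']}" for d in docs)


def build_qa_chain(model: str = "llama3"):
    """Compose input mapping, prompt, LLM, and output parser with LCEL pipes."""
    # Imported lazily so the helpers above work without LangChain installed.
    from langchain_community.chat_models import ChatOllama
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate

    prompt = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
    llm = ChatOllama(model=model)  # assumes Ollama is serving this model locally
    inputs = {
        "docs": lambda x: format_docs(x["docs"]),
        "question": itemgetter("question"),
        "doc_type": itemgetter("doc_type"),
    }
    return inputs | prompt | llm | StrOutputParser()
```

With Ollama running, the chain would be invoked along the lines of build_qa_chain().invoke({"docs": get_docs(paths), "question": question, "doc_type": "resume"}), where paths is the list returned by the Spotlight search.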

One of the reasons I really love LangChain is that, with LangSmith, it offers a very powerful debugging experience.

