# Local Document QA Using Docling and Granite

**Source:** [https://www.youtube.com/watch?v=eBJblHgDaRs](https://www.youtube.com/watch?v=eBJblHgDaRs)
**Duration:** 00:04:04

## Key Points

- The tutorial shows how to build a local document-based question-answering system using IBM's open-source Docling for format conversion and the Granite 3.1 model (run via Ollama) for large-context text processing.
- A six-step Jupyter notebook guides you through environment setup, creating helper functions for format detection and Docling conversion (to markdown), chunking the document, storing chunks in a vector store, and wiring the retrieval-augmented generation chain.
- The example works with a nearly 200-page IBM Redbook PDF, demonstrating Docling's ability to handle complex PDFs and Granite 3.1's capacity to use up to ten retrieved chunks, exploiting its extensive context window.
- The complete code is available on GitHub; processing time varies with your local hardware, and the presenter notes it takes roughly one minute on his machine.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=eBJblHgDaRs&t=0s) **Untitled Section**
- [00:03:07](https://www.youtube.com/watch?v=eBJblHgDaRs&t=187s) **Docling and Granite QA Demo** - The speaker demonstrates converting a PDF to Markdown with Docling, feeding it to Granite 3.1 to create a local question-answering system, and shares the complete code on GitHub.

## Full Transcript
Welcome to this tutorial on creating a document-based question-and-answering system
that works locally on your own computer,
using IBM's open-source toolkit Docling in conjunction with the Granite 3.1 model.
Docling facilitates the parsing and conversion of various document formats,
while Granite 3.1, with its extensive context window, enables efficient processing of large textual data.
You're going to need to install Ollama on your machine as that's what we're going to use to run the Granite 3.1 model.
A few dependencies are also downloaded as part of the tutorial.
I recommend that you go through the tutorial step by step
in order to ensure that you're following along and understand what each step does.
The code for this tutorial is also available on our GitHub as a Jupyter Notebook.
And I'm going to walk you through all of those steps and go through and run this in a few moments.
So let's run through that Jupyter notebook.
There are six steps in this notebook, one to set up the environment.
Another for creating a helper function to detect the document format.
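The format-detection helper itself isn't shown in the video; a minimal sketch of what such a function might look like (the function name, the extension map, and the set of formats are illustrative assumptions, not the notebook's actual code):

```python
from pathlib import Path

# Illustrative map from file extensions to formats Docling can parse;
# the real notebook may support a different set of formats.
_EXTENSION_MAP = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".pptx": "pptx",
    ".html": "html",
    ".md": "markdown",
}

def detect_document_format(path: str) -> str:
    """Return a format label for the file, based on its extension."""
    ext = Path(path).suffix.lower()
    try:
        return _EXTENSION_MAP[ext]
    except KeyError:
        raise ValueError(f"Unsupported document format: {ext!r}")
```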
Step three is a function from which we're going to call Docling to do the document conversion.
In our case, we're actually going to be converting the document to markdown so you can see what Docling is doing.
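The conversion step can be sketched roughly as follows. This is an illustration based on Docling 2.x's `DocumentConverter` API; the helper names (`markdown_path`, `convert_to_markdown`) are assumptions, not the notebook's actual code:

```python
from pathlib import Path

def markdown_path(src: str) -> Path:
    """Derive the output .md path next to the source document."""
    return Path(src).with_suffix(".md")

def convert_to_markdown(src: str) -> Path:
    """Convert a document to markdown with Docling and write it to disk."""
    # Imported inside the function so the sketch only needs Docling
    # installed when a conversion is actually run.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(src)
    out = markdown_path(src)
    out.write_text(result.document.export_to_markdown(), encoding="utf-8")
    return out
```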
Step four is where we're going to set up our question and answer chain,
using the document, splitting it into chunks, storing it in a vector store, and then setting up our language model.
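The splitting step cuts the markdown into fixed-size, overlapping chunks before they're embedded and stored. The notebook's actual splitter isn't shown in detail; a simplified, dependency-free sketch of the idea, with sizes chosen purely for illustration:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks so context isn't lost at boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # each chunk starts this far after the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks
```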
Step five is where we're going to set up our question answering interface.
And step six is where we actually perform the question answering itself.
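At answer time, the chain stuffs the retrieved chunks and the user's question into a single prompt for Granite. The template below is a hypothetical illustration of that wiring, not the prompt used in the notebook:

```python
def build_qa_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded QA prompt from retrieved chunks and a question."""
    context = "\n\n".join(f"[chunk {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```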
Before I go through and run the cells in the notebook, I wanted to show you the document that we're going to use.
What I have in front of me here is an IBM Redbook that we downloaded from the IBM website.
And it's about creating OpenShift multi-architecture clusters with IBM Power.
As you can see, this PDF document is 12 chapters long and it has nearly 200 pages in it.
It's quite a large document.
So what we're going to do is let's run through these cells and see the output.
So first off, we're pulling our models locally using Ollama.
I'm assuming here that you've already installed Ollama on your machine.
We're then going to pull some dependencies down.
Now here is the dependency on Docling 2.0.
We're then going to import those things into our project.
We're going to create a helper function here.
We're going to create a document conversion function here.
We then set up our question answering chain.
As I showed you,
note that I have set my retriever here to retrieve ten documents from the retrieval chain in order to answer questions.
This is to take advantage of granite 3.1's large context window.
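The vector store ranks chunks by similarity to the question and returns the top ten. The notebook does this with an embedding model and a vector store; the dependency-free sketch below illustrates the same top-k idea with bag-of-words cosine similarity, which a real embedding model would replace in practice:

```python
import math
from collections import Counter

def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str], k: int = 10) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = Counter(question.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: _cosine(q, Counter(c.lower().split())),
                    reverse=True)
    return ranked[:k]
```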
Then the interface gets put together, and now we're performing the question and answering.
The type of machine that you have locally will determine how long this process takes.
At the moment it's taking me one minute and I think I have one more question waiting to be answered.
Great.
And there we go.
Here are all my answers.
So I've just shown you how you can use Docling and Granite 3.1
to build a chat-based question-and-answering system that works locally on your computer.
In our function to convert the document, we've actually chosen to convert it
to markdown so that you can see the work that Docling did.
So if I open up this markdown file here,
you can actually see this is Docling converting the PDF that I provided it into a format that we then sent to the LLM.
And as you can see, it's done a pretty good job.
The code for all of this, including this IBM Redbook, is available in our GitHub.
Have a play with it and let me know what you think.