
Local Document QA Using Docling and Granite

Key Points

  • The tutorial shows how to build a local document‑based question‑answering system using IBM’s open‑source Docling for format conversion and the Granite 3.1 model (run via Ollama) for large‑context text processing.
  • A six‑step Jupyter notebook guides you through environment setup, creating helper functions for format detection and Docling conversion (to markdown), chunking the document, storing chunks in a vector store, and wiring the retrieval‑augmented generation chain.
  • The example works with a 200‑page IBM Red Book PDF, demonstrating Docling’s ability to handle complex PDFs and Granite 3.1’s capacity to retrieve up to ten relevant chunks, exploiting its extensive context window.
  • The complete code is available on GitHub, and the process time varies with your local hardware; the presenter notes it takes roughly one minute on his machine.
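The chunking step mentioned in the key points above can be sketched in plain Python. The actual notebook presumably uses a library text splitter; the `chunk_text` name and the size/overlap values here are illustrative assumptions, not the notebook's code:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap,
    so sentences straddling a boundary appear in both chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Stand-in for the markdown text produced by Docling.
doc = "word " * 300
chunks = chunk_text(doc, chunk_size=400, overlap=40)
print(len(chunks))  # 5 chunks for this 1500-character input
```

Each chunk would then be embedded and stored in the vector store so the retriever can pull back the most relevant pieces at question time.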

Full Transcript

# Local Document QA Using Docling and Granite

**Source:** [https://www.youtube.com/watch?v=eBJblHgDaRs](https://www.youtube.com/watch?v=eBJblHgDaRs)
**Duration:** 00:04:04

## Sections

- [00:00:00](https://www.youtube.com/watch?v=eBJblHgDaRs&t=0s) **Untitled Section**
- [00:03:07](https://www.youtube.com/watch?v=eBJblHgDaRs&t=187s) **Docling and Granite QA Demo** - The speaker demonstrates converting a PDF to Markdown with Docling, feeding it to Granite 3.1 to create a local question-answering system, and shares the complete code on GitHub.

## Full Transcript
0:00 Welcome to this tutorial on creating a document-based question and answering system 0:04 that works locally on your own computer, 0:06 using IBM's open-source toolkit Docling in conjunction with the Granite 3.1 model. 0:12 Docling facilitates the parsing and conversion of various document formats, 0:16 while Granite 3.1, with its extensive context window, enables efficient processing of large textual data. 0:23 You're going to need to install Ollama on your machine, as that's what we're going to use to run the Granite 3.1 model. 0:30 A few dependencies are also downloaded as part of the tutorial. 0:34 I recommend that you go through the tutorial step by step 0:37 in order to ensure that you're following along and understand what each step does. 0:42 The code for this tutorial is also available on our GitHub here as a Jupyter Notebook. 0:47 And I'm going to walk you through all of those steps and go through and run this in a few moments. 0:52 So let's run through that Jupyter notebook. 0:55 There are six steps in this notebook: one to set up the environment. 1:00 Another for creating a helper function to detect the document format. 1:06 Step three is a function in which we're going to be calling Docling to do the document conversion. 1:12 In our case, we're actually going to be converting the document to markdown so you can see what Docling is doing. 1:18 Step four is where we're going to set up our question and answer chain, 1:22 using the document, splitting it into chunks, storing it in a vector store, and then setting up our language model. 1:29 Step five is where we're going to set up our question answering interface. 1:32 And step six is where we actually perform the question answering itself. 1:36 Before I go through and run the cells in the notebook, I wanted to show you the document that we're going to use. 1:40 What I have in front of me here is an IBM Red Book that we downloaded from the IBM website. 1:45 And it's about creating OpenShift multi-architecture clusters with IBM Power.
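The format-detection helper described at 1:00 can be sketched with a simple extension lookup. This is a hypothetical stand-in, not the notebook's actual helper: the `detect_format` name, the `SUPPORTED_FORMATS` mapping, and the choice to dispatch on file extension are all assumptions for illustration.

```python
from pathlib import Path

# Hypothetical mapping of extensions to format labels;
# the notebook's helper and supported formats may differ.
SUPPORTED_FORMATS = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".pptx": "pptx",
    ".html": "html",
    ".md": "md",
}

def detect_format(path: str) -> str:
    """Return a format label based on the file extension,
    raising for anything the converter can't handle."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported document format: {suffix}")
    return SUPPORTED_FORMATS[suffix]

print(detect_format("redbook.pdf"))  # pdf
```

The conversion function in step three would then pass the detected format (and the file) to Docling, which exports the parsed document as markdown.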
1:50 As you can see, this PDF document is 12 chapters long and it has nearly 200 pages in it. 1:58 It's quite a large document. 2:01 So what we're going to do is run through these cells and see the output. 2:05 So first off, we're pulling our models locally using Ollama. 2:08 I'm assuming here that you've already installed Ollama on your machine. 2:12 We're then going to pull some dependencies down. 2:15 Now, here is the dependency on Docling 2.0. 2:19 We're then going to import those things into our project. 2:23 We're going to create a helper function here. 2:27 We're going to create a document conversion function here. 2:31 We then set up our question answering chain. 2:33 As I showed you, 2:34 note that I have set my retriever here to retrieve ten documents from the retrieval chain in order to answer questions. 2:41 This is to take advantage of Granite 3.1's large context window. 2:46 Then the interface is getting put together, and now we're performing the question and answering. 3:01 Depending on the type of machine that you have locally, that will determine how long this process takes. 3:07 At the moment it's taking me one minute, and I think I have one more question waiting to be answered. 3:15 Great. 3:16 And there we go. 3:17 Here are all my answers. 3:21 So I've just shown you how you can use Docling and Granite 3.1 3:27 in order to build a question and answering system using chat that works locally on your computer, 3:32 and in our function to convert the document, we've actually chosen to convert the document 3:37 to markdown so that you can see the work that Docling did. 3:42 So if I open up this markdown file here, 3:44 you can actually see this is Docling converting the PDF that I provided into a format that we then sent to the LLM. 3:51 And as you can see, it's done a pretty good job. 3:58 The code for all of this, including this IBM Red Book, is available in our GitHub. 4:02 Have a play with it and let me know what you think.
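The retriever setting the presenter mentions at 2:34 (pulling back ten chunks to exploit Granite 3.1's large context window) can be illustrated with a toy top-k retriever. The actual notebook uses a vector store with embeddings; this stdlib sketch scores chunks by simple word overlap with the query, and the `retrieve` helper and its data are stand-ins for illustration only:

```python
def retrieve(query: str, chunks: list[str], k: int = 10) -> list[str]:
    """Score each chunk by word overlap with the query and return the
    top-k chunks, mirroring the retriever's k=10 setting."""
    q = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]

# Toy corpus: odd-numbered chunks are relevant, even ones are filler.
chunks = [
    f"chunk {i} about openshift clusters" if i % 2 else f"chunk {i} filler"
    for i in range(30)
]
top = retrieve("openshift multi architecture clusters", chunks, k=10)
print(len(top))  # 10
```

The ten retrieved chunks would then be stuffed into the prompt alongside the user's question, which is where the large context window pays off: a smaller window would force a lower k or aggressive truncation.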