Unstructured
This notebook covers how to use the Unstructured document loader to load files of many types. Unstructured currently supports loading text files, PowerPoints, HTML, PDFs, images, and more.
Please see this guide for more instructions on setting up Unstructured locally, including setting up required system dependencies.
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
UnstructuredLoader | langchain_unstructured | ✅ | ❌ | ✅ |
Loader features
Source | Document Lazy Loading | Native Async Support |
---|---|---|
UnstructuredLoader | ✅ | ❌ |
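Lazy loading, as listed in the table above, means documents can be consumed one at a time instead of being collected into a list up front. The following is a minimal sketch, assuming `langchain-unstructured` is installed together with whatever partitioning backend you use. The helpers `first_n_documents` and `lazy_documents` are illustrative names and not part of the library.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def first_n_documents(docs: Iterable[Any], n: int) -> List[Any]:
    """Take at most n items from an iterator, leaving the rest unconsumed."""
    return list(islice(docs, n))


def lazy_documents(file_path: str) -> Iterator[Any]:
    """Return an iterator over the documents produced for file_path."""
    # Imported inside the function so the helper above can be used
    # even when langchain-unstructured is not installed.
    from langchain_unstructured import UnstructuredLoader

    loader = UnstructuredLoader(file_path=file_path)
    # lazy_load() yields Document objects one at a time.
    return loader.lazy_load()
```

With a real file on disk, something like `first_n_documents(lazy_documents("example.pdf"), 5)` would return at most the first five documents; the file name here is hypothetical.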
Setup
Credentials
By default, langchain-unstructured installs with a smaller footprint that offloads the partitioning logic to the Unstructured API, which requires an API key. If you use the local installation, you do not need an API key. To get an API key, head over to this site, then set it in the cell below:
import getpass
import os

if "UNSTRUCTURED_API_KEY" not in os.environ:
    os.environ["UNSTRUCTURED_API_KEY"] = getpass.getpass(
        "Enter your Unstructured API key: "
    )
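Because the key is only needed when partitioning is offloaded to the API, it can be useful to check up front whether a key is actually required. The sketch below uses only the standard library. It assumes that a local installation can be detected by the `unstructured` package being importable, which is a simplification; `local_unstructured_installed` and `needs_api_key` are hypothetical helper names.

```python
import importlib.util
import os
from typing import Mapping, Optional


def local_unstructured_installed() -> bool:
    # Assumption: a local installation is present when the
    # `unstructured` package can be found on the import path.
    return importlib.util.find_spec("unstructured") is not None


def needs_api_key(use_api: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when API partitioning is requested but no key is set."""
    env = os.environ if env is None else env
    return use_api and not env.get("UNSTRUCTURED_API_KEY")
```

For instance, `needs_api_key(use_api=not local_unstructured_installed())` reports whether the prompt in the cell above still has to run.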