Translated by AI
Building a Custom Block to Read PDF Files in Dify
Hello everyone, are you using Dify? Dify is extremely convenient because it allows you to easily create apps such as agents and RAG using low-code, and you can easily monitor the usage of the apps and the inner workings of the system!
One point of concern is that, as of this writing (June 2024), it does not support PDF file input. There are many situations in business where you want to analyze PDF files, right?
However, with a bit of ingenuity, you can create a block that can read PDF files from local or shared folders!
How it works
The mechanism is as follows:
- Input the file path into Dify
- Send the file path to a self-hosted PDF reading server
- The server analyzes the file, extracts the text, and returns it
- Start text analysis!
Let's get started!
0. Set up Dify locally
First, you need to set up Dify locally!
The setup itself is well covered in other people's articles, so please refer to those for this step.
The file path must be accessible from the PC where Dify is hosted, so it needs to be on your own PC or a shared folder. I have set mine up using WSL.
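If Dify and the PDF reading server run inside WSL as mine do, a Windows-style path has to be given in the form WSL sees it. Here is a minimal sketch of that conversion, assuming the default WSL configuration where Windows drives are mounted under /mnt/<drive letter>. The function name windows_to_wsl_path is hypothetical and not part of Dify or WSL.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Convert an absolute Windows path to its default WSL mount path.

    Assumes the default WSL automount root (/mnt). UNC paths such as
    \\\\server\\share are rejected because they are not mounted automatically.
    """
    p = PureWindowsPath(windows_path)
    drive = p.drive.rstrip(":").lower()
    # Require a single drive letter and an absolute path (e.g. C:\...).
    if len(drive) != 1 or not p.root:
        raise ValueError(f"Not an absolute drive path: {windows_path!r}")
    # p.parts[0] is the drive plus root ("C:\\"), so skip it.
    return str(PurePosixPath("/mnt", drive, *p.parts[1:]))


print(windows_to_wsl_path(r"C:\Users\me\report.pdf"))  # /mnt/c/Users/me/report.pdf
```

The converted path is what you would type into the Dify input field. Shared folders need to be mounted inside WSL first, and then referenced by their mount point.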
1. Input the file path into Dify
Once Dify is set up, start creating an app using a workflow.

In the Start block, the field type is set to short text and the variable name is 'path'.

2. Send the file path to a self-hosted server
To send the file path to the server, use a Code block.

The input variable is the 'path' specified earlier.
In the advanced dependencies, select requests so that the code can import it.
The code for Python 3 is as follows.
This code sends the path to the server; if the reading is successful on the server side, it retrieves the PDF text, and if it fails, it returns an error message.
import requests


def main(path: str):
    # Set the server URL
    server_url = 'http://XXX.XXX.XXX.XXX:5000/'

    # Send a POST request to the server and pass the file path as form data
    response = requests.post(server_url, data={'path': path})

    # Check if the HTTP status code is 200 (Success)
    if response.status_code == 200:
        response_data = response.json()
        # Check if the status is 'success'
        if response_data['status'] == 'success':
            # Retrieve the text as the result
            result = response_data['text']
        else:
            # Retrieve the error message as the result
            result = f"Error: {response_data['message']}"
    else:
        # Error message for failed HTTP request
        result = f"Failed to upload file. Status code: {response.status_code}"

    # Return the result in dictionary format
    return {
        'result': result
    }
For server_url, specify the IP address of the PDF reading server (explained in the next section).
The output variable is result.
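The Code block has three possible outcomes: the extracted text, an error reported by the server, or a failed HTTP request. To check that branching without a running server, the decision can be pulled out into a pure function, as in the sketch below. The name interpret_response is hypothetical, and the fallback strings for missing keys are my own assumption; the messages otherwise follow the Code block.

```python
def interpret_response(status_code: int, payload: dict) -> dict:
    """Map the server's HTTP status and JSON payload to the Code block output."""
    if status_code != 200:
        # The HTTP request itself failed.
        return {'result': f"Failed to upload file. Status code: {status_code}"}
    if payload.get('status') == 'success':
        # The server read the PDF and returned its text.
        return {'result': payload.get('text', '')}
    # The server answered, but reported an error (e.g. file not found).
    return {'result': f"Error: {payload.get('message', 'unknown error')}"}


print(interpret_response(200, {'status': 'success', 'text': 'Hello PDF'}))
# {'result': 'Hello PDF'}
print(interpret_response(200, {'status': 'error', 'message': 'No file path provided'}))
# {'result': 'Error: No file path provided'}
```

Whichever branch is taken, the output is always a dictionary with a single result key, which is why the block needs only one output variable.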
3. Server receives the file path, analyzes the file, and returns the text
You can set up the PDF reading server on the same PC where Dify is running.
Install the packages (request and jsonify are part of Flask, so they do not need to be installed separately):
pip install flask pypdf2
We will build a Flask server that reads PDFs using PyPDF2.
The code is as follows:
from flask import Flask, request, jsonify
from PyPDF2 import PdfReader

# Create an instance of the Flask application
app = Flask(__name__)


# Define the root endpoint and handle POST requests
@app.route('/', methods=['POST'])
def upload_file():
    response_data = {}
    # Get the path of the PDF file sent from the client
    # (.get returns None instead of raising when the field is missing)
    pdf_path = request.form.get('path')

    if pdf_path:  # Check if a PDF file path is provided
        try:
            with open(pdf_path, 'rb') as pdf_file:  # Read the PDF file from the specified path
                reader = PdfReader(pdf_file)
                text = ""
                for page in reader.pages:  # Extract text from all pages of the PDF
                    # Guard against pages that yield no text
                    text += page.extract_text() or ""
            response_data['status'] = 'success'  # Set the status if processing is successful
            response_data['text'] = text
        except Exception as e:  # Error handling
            response_data['status'] = 'error'
            response_data['message'] = str(e)
    else:
        response_data['status'] = 'error'  # Set the status if no file path is provided
        response_data['message'] = 'No file path provided'

    return jsonify(response_data)  # Return response data in JSON format


if __name__ == '__main__':
    # Listen on all network interfaces (0.0.0.0), port 5000.
    # debug=True is meant for local development only.
    app.run(host='0.0.0.0', port=5000, debug=True)
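One thing to keep in mind is that this server opens whatever path it receives and listens on all network interfaces. A possible safeguard, which is my own assumption and not part of the setup above, is to accept only .pdf files under one chosen folder. The names resolve_pdf_path and allowed_root, and the /srv/pdfs folder, are hypothetical.

```python
from pathlib import Path


def resolve_pdf_path(raw_path: str, allowed_root: str) -> Path:
    """Return the resolved path if it is a .pdf under allowed_root, else raise ValueError."""
    root = Path(allowed_root).resolve()
    candidate = Path(raw_path).resolve()  # also collapses ".." segments
    if candidate.suffix.lower() != ".pdf":
        raise ValueError("Only .pdf files are accepted")
    # The file must sit somewhere below the allowed folder.
    if root not in candidate.parents:
        raise ValueError("Path is outside the allowed folder")
    return candidate


# Hypothetical usage: only files under /srv/pdfs pass the check.
print(resolve_pdf_path("/srv/pdfs/report.pdf", "/srv/pdfs").name)  # report.pdf
```

In the Flask handler, a check like this could run inside the try block before open(), so that a rejected path comes back as a normal error message.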
Run the server!
python XXX.py
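Once the server is running, it is handy to check it on its own before wiring it into Dify. Below is a minimal sketch using only the Python standard library. It sends the path as form data, which is what the server reads from request.form. The function names, the 127.0.0.1 address, and the sample path in the note that follows are all placeholders of mine.

```python
import json
import urllib.parse
import urllib.request


def build_request(server_url: str, pdf_path: str) -> urllib.request.Request:
    """Build the same form-encoded POST that the Dify Code block sends."""
    body = urllib.parse.urlencode({"path": pdf_path}).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_pdf_text(server_url: str, pdf_path: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the server's JSON reply."""
    request = build_request(server_url, pdf_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, calling fetch_pdf_text("http://127.0.0.1:5000/", "/mnt/c/docs/sample.pdf") should return a dictionary with status set to success or error, matching what the Code block in Dify expects.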
4. Start text analysis!
From here, it's up to you to build the blocks however you like! Whether it's for text extraction or having an LLM read it... being able to load PDF files significantly expands the possibilities for your apps!
Summary
Thank you for reading this far. Handling PDF files is a common scenario in business. By applying the method shown here, you can extend Dify's functionality to support a wider variety of data sources. Please give it a try. Let's continue to deepen our learning for efficient app development with Dify.
I hope this blog post helps your projects. If you have any questions or feedback, please let me know in the comments. Happy developing!