Let’s build a simple AI agent in Python to illustrate the concept. The agent will:
- Accept user input
- Plan steps to reach the goal
- Execute actions
- Learn or adapt slightly
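In code, that loop can be tiny. Here is a purely illustrative skeleton; the `plan_steps` and `execute` helpers are hypothetical stubs that the rest of this post replaces with a local LLM, a task queue, and a vector store:

```python
# A conceptual agent loop — purely illustrative. The helpers below are
# hypothetical placeholders, not part of the final system.
def plan_steps(goal: str) -> list[str]:
    # Stub planner: a real agent would ask an LLM to decompose the goal.
    return [f"research: {goal}", f"answer: {goal}"]

def execute(step: str) -> str:
    # Stub executor: a real agent would call tools or an LLM here.
    return f"done: {step}"

memory: list[str] = []  # the agent's (very shallow) memory

def run_agent(goal: str) -> None:
    for step in plan_steps(goal):  # plan
        result = execute(step)     # act
        memory.append(result)      # learn/adapt slightly
        print(result)

run_agent("summarize this PDF")
```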
How are we going to make it happen? Let's break it down into small pieces.
A Python AI agent that:
- Takes your input or question.
- Uses a local open-source LLM (like Llama 2, Mistral 7B, Phi-2, or TinyLlama 1.1B).
- Plans steps or answers questions.
- Optionally stores knowledge.
Technologies:
- Python
- Ollama, LM Studio, or local Hugging Face Transformers
- FastAPI (optional, if you want a web API)
- LangChain (optional, for easy LLM chaining)
Overview
1. Client uploads PDF/image OR asks a question → FastAPI generates a request_id → enqueues an ARQ job with that ID.
2. ARQ worker:
- Processes PDF/image → stores text → updates vector DB.
- Stores result in Redis with request_id as key.
3. When done:
- The worker optionally pushes the result via WebSocket if the client is connected.
- Or the client can poll /result/{request_id} to get the status/result.
Requirements:
pip install fastapi uvicorn arq redis websockets langchain-community chromadb openai PyMuPDF pytesseract pillow
Redis must be running. Either install a Redis server locally or run it with Docker:
docker run -p 6379:6379 redis
ARQ Worker (worker.py)
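Below is a minimal sketch of what the worker can look like. The `result:{request_id}` key convention, the Chroma collection name, and the `mistral` model name are assumptions of this sketch; swap in whichever local model Ollama is serving for you.

```python
# worker.py — a minimal sketch of the ARQ worker. Key names, the collection
# name, and the model name are assumptions, not fixed requirements.
import json

import chromadb
import fitz  # PyMuPDF
import pytesseract
from arq.connections import RedisSettings
from langchain_community.llms import Ollama
from PIL import Image

# Chroma persists embeddings on disk; the path and collection name are arbitrary.
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection("documents")


async def _store_result(ctx, request_id: str, payload: dict) -> None:
    # ctx["redis"] is the Redis pool the worker already holds; results
    # expire after an hour so Redis doesn't fill up.
    await ctx["redis"].set(f"result:{request_id}", json.dumps(payload), ex=3600)


async def process_pdf(ctx, request_id: str, path: str) -> None:
    # Extract text from every page, index it, and publish the result.
    doc = fitz.open(path)
    text = "\n".join(page.get_text() for page in doc)
    collection.add(documents=[text], ids=[request_id])
    await _store_result(ctx, request_id, {"status": "done", "chars": len(text)})


async def process_image(ctx, request_id: str, path: str) -> None:
    # OCR the image with Tesseract, then index the recognized text.
    text = pytesseract.image_to_string(Image.open(path))
    collection.add(documents=[text], ids=[request_id])
    await _store_result(ctx, request_id, {"status": "done", "text": text})


async def answer_question(ctx, request_id: str, question: str) -> None:
    # Retrieve the most relevant stored chunks, then ask the local LLM.
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0]) if hits["documents"] else ""
    llm = Ollama(model="mistral")  # assumes `ollama run mistral` works locally
    answer = llm.invoke(f"Context:\n{context}\n\nQuestion: {question}")
    await _store_result(ctx, request_id, {"status": "done", "answer": answer})


class WorkerSettings:
    functions = [process_pdf, process_image, answer_question]
    redis_settings = RedisSettings(host="localhost", port=6379)
```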
FastAPI with request ID & result polling (main.py)
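And a matching sketch of the API layer. The endpoint paths, upload directory, and result-key convention follow the worker sketch above and are assumptions of this post, not fixed requirements. The WebSocket endpoint here polls Redis on the client's behalf; a production setup might use Redis pub/sub for a true push.

```python
# main.py — a minimal sketch of the API layer, matching the worker sketch.
import asyncio
import json
import os
import uuid

from arq import create_pool
from arq.connections import RedisSettings
from fastapi import FastAPI, File, UploadFile, WebSocket

app = FastAPI()
arq_pool = None  # ArqRedis pool, created on startup
UPLOAD_DIR = "uploads"  # assumption: the worker reads files from here


@app.on_event("startup")
async def startup() -> None:
    global arq_pool
    arq_pool = await create_pool(RedisSettings(host="localhost", port=6379))


async def _save_upload(file: UploadFile) -> str:
    # Persist the upload to disk so the worker process can open it.
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    path = os.path.join(UPLOAD_DIR, f"{uuid.uuid4()}_{file.filename}")
    with open(path, "wb") as f:
        f.write(await file.read())
    return path


async def _enqueue(job: str, *args) -> dict:
    # Generate the request_id here and hand it to the worker with the job.
    request_id = str(uuid.uuid4())
    await arq_pool.enqueue_job(job, request_id, *args)
    return {"request_id": request_id, "status": "queued"}


@app.post("/upload/pdf")
async def upload_pdf(file: UploadFile = File(...)):
    return await _enqueue("process_pdf", await _save_upload(file))


@app.post("/upload/image")
async def upload_image(file: UploadFile = File(...)):
    return await _enqueue("process_image", await _save_upload(file))


@app.post("/ask")
async def ask(question: str):
    return await _enqueue("answer_question", question)


@app.get("/result/{request_id}")
async def get_result(request_id: str):
    raw = await arq_pool.get(f"result:{request_id}")
    if raw is None:
        return {"request_id": request_id, "status": "pending"}
    return {"request_id": request_id, **json.loads(raw)}


@app.websocket("/ws/{request_id}")
async def ws_result(websocket: WebSocket, request_id: str):
    # Wait until the worker has written a result, then push it to the client.
    await websocket.accept()
    while (raw := await arq_pool.get(f"result:{request_id}")) is None:
        await asyncio.sleep(1)
    await websocket.send_json(json.loads(raw))
    await websocket.close()
```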
Run everything (each command in its own terminal):
uvicorn main:app --reload
arq worker.WorkerSettings
Here are example curl commands for each API endpoint in the AI processing system. The paths and parameters follow the main.py sketch above, so adjust them if your routes differ.
Upload a PDF:
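curl -X POST http://localhost:8000/upload/pdf -F "file=@document.pdf"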
Upload an Image:
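curl -X POST http://localhost:8000/upload/image -F "file=@scan.png"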
Ask a Question:
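curl -X POST "http://localhost:8000/ask?question=What%20is%20in%20the%20uploaded%20document%3F"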
Get Result by Request ID:
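curl http://localhost:8000/result/<request_id>

Use the request_id returned by any of the calls above; while the job is still running you will get back a "pending" status.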
Hope you find it helpful!!