Automating Industry Insights: A Deep Dive into Report Generation with CrewAI and Neo4j

Introduction

Generating insightful reports quickly is crucial for informed decision-making. Manually compiling data from multiple sources, analyzing trends, and synthesizing the findings into a coherent report is slow and labor-intensive. This tutorial demonstrates how to automate that process by combining CrewAI, an agent orchestration framework, with Neo4j, a graph database. We will build a system in which specialized AI agents gather industry-specific data, analyze relevant news, and produce comprehensive, well-structured reports.

Understanding the Architecture

Our automated report generation system is built around a crew of three distinct AI agents, each with a specialized role and set of capabilities:

Data Researcher Agent: This agent gathers and analyzes data on organizations within a designated city and industry. It reports the total number of companies, the number of public companies, their combined revenue, and the top organizations ranked by employee count.
News Analyst Agent: This agent focuses on extracting and summarizing the latest news pertaining to the companies identified by the Data Researcher. It aims to provide a snapshot of current market trends, significant company developments, and sentiment analysis derived from recent news articles.
Report Writer Agent: This agent acts as the synthesizer, taking the information gathered and analyzed by the other two agents and compiling it into a polished, actionable markdown report. A key directive for this agent is to strictly adhere to the provided information, ensuring no unsupported data is included.
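
To make the Data Researcher's four figures concrete, here is a minimal sketch that computes them over in-memory records. It is not the tutorial's implementation, which queries Neo4j; the `Organization` record, its field names, and the sample rows are hypothetical and exist only to illustrate the aggregation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Organization:
    # Hypothetical record shape; field names are illustrative only.
    name: str
    is_public: bool
    revenue: Optional[float]   # None when revenue is not recorded
    employees: Optional[int]   # None when headcount is not recorded


def summarize(orgs: list[Organization], top_n: int = 5) -> dict:
    """Compute the four figures the Data Researcher is asked for."""
    # Rank only organizations with a known headcount, largest first.
    ranked = sorted(
        (o for o in orgs if o.employees is not None),
        key=lambda o: o.employees,
        reverse=True,
    )
    return {
        "company_count": len(orgs),
        "public_count": sum(1 for o in orgs if o.is_public),
        "combined_revenue": sum(o.revenue for o in orgs if o.revenue is not None),
        "top_by_employees": [o.name for o in ranked[:top_n]],
    }


# Placeholder records, invented purely to exercise the function.
sample = [
    Organization("Alpha", True, 100.0, 50),
    Organization("Beta", False, None, 200),
    Organization("Gamma", True, 25.5, None),
]
summary = summarize(sample, top_n=2)
print(summary)
```

Missing values are skipped rather than treated as zero in the ranking, which is one reasonable choice when the underlying data is incomplete.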

These agents work in a sequential flow, with the output of one agent feeding into the next, creating a robust pipeline for generating detailed industry reports tailored to specific geographical and industrial contexts.
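
The sequential hand-off described above can be modeled without any framework. The sketch below is a library-free illustration, not CrewAI's API: the three stage functions are stand-ins for the agents, `run_sequential` is a hypothetical helper, and the input values are placeholders.

```python
from typing import Callable

# Each stage receives the accumulated context and returns its own output.
Stage = Callable[[dict], str]


def data_researcher(context: dict) -> str:
    # Stand-in for the agent that gathers data for a city and industry.
    return f"data for {context['industry']} in {context['city']}"


def news_analyst(context: dict) -> str:
    # Stand-in: builds on what the previous stage found.
    return f"news based on [{context['data_researcher']}]"


def report_writer(context: dict) -> str:
    # Stand-in: uses only what the earlier stages produced.
    return f"# Report\n{context['data_researcher']}\n{context['news_analyst']}"


def run_sequential(stages: list[Stage], inputs: dict) -> dict:
    """Run stages in order, storing each output under the stage's name."""
    context = dict(inputs)
    for stage in stages:
        context[stage.__name__] = stage(context)
    return context


result = run_sequential(
    [data_researcher, news_analyst, report_writer],
    {"city": "Example City", "industry": "Example Industry"},
)
print(result["report_writer"])
```

Because every stage reads from one shared context, the report writer can see only what the earlier stages stored, which mirrors the directive that the final report contain no unsupported data.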

Setting Up the Environment

Before we dive into the agent and tool implementation, it's essential to set up the necessary environment. This involves establishing a connection to your Neo4j database and configuring your Large Language Model (LLM) provider, in this case, OpenAI with GPT-4o.

Neo4j Connection

First, establish a connection to your Neo4j instance. For demonstration purposes, we use a publicly available demo database. In a production environment, you would replace the URI and credentials below with those of your own database.

# Neo4j connection setup
from neo4j import GraphDatabase

URI = "neo4j+s://demo.neo4jlabs.com"
AUTH = ("companies", "companies")
driver = GraphDatabase.driver(URI, auth=AUTH)
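
For the production case mentioned above, one possible approach is to read the connection settings from environment variables and fall back to the demo values. This is a sketch under stated assumptions: the variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` and the helpers `load_neo4j_settings` and `make_driver` are hypothetical, not something the tutorial prescribes.

```python
import os
from typing import Mapping, Optional

# Defaults taken from the demo connection shown above.
DEMO_URI = "neo4j+s://demo.neo4jlabs.com"
DEMO_USER = "companies"
DEMO_PASSWORD = "companies"


def load_neo4j_settings(
    env: Optional[Mapping[str, str]] = None,
) -> tuple[str, tuple[str, str]]:
    """Return (uri, (user, password)), preferring environment variables.

    The variable names are assumed for illustration.
    """
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", DEMO_URI)
    user = env.get("NEO4J_USERNAME", DEMO_USER)
    password = env.get("NEO4J_PASSWORD", DEMO_PASSWORD)
    return uri, (user, password)


def make_driver(env: Optional[Mapping[str, str]] = None):
    """Create a driver from the resolved settings (requires the neo4j package)."""
    # Imported lazily so load_neo4j_settings stays dependency-free.
    from neo4j import GraphDatabase

    uri, auth = load_neo4j_settings(env)
    return GraphDatabase.driver(uri, auth=auth)
```

Calling `make_driver().verify_connectivity()` once at startup is a cheap way to surface a wrong URI or bad credentials before any agent runs.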

OpenAI API Key Configuration

Next, we need to set up our OpenAI API key. CrewAI uses this key to access the GPT-4o model for agent reasoning and text generation. Manage the key through environment variables or another secure mechanism rather than hard-coding it in your source.

# Set your OpenAI API key (prompted at runtime rather than hard-coded)
import getpass
import os
from crewai import LLM
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI key: ")
llm = LLM(model=