Index Confluence Data¶
Availability
Indexing tools are available in the Next environment (Release 1.7.0) and replace legacy Datasources/Datasets. For context, see Release Notes 1.7.0 and the Indexing Overview.
This guide provides a complete step-by-step walkthrough for indexing Confluence data and then searching or chatting with the indexed content using ELITEA's AI-powered tools.
Overview¶
Confluence indexing allows you to create searchable indexes from your Confluence knowledge base content:
- Pages & Documentation: Meeting notes, project documentation, how-to guides, and knowledge articles
- Spaces: Organized content collections for teams, projects, or departments
- Labels & Metadata: Content categorization and tagging systems
- Attachments: Document files, images, and other media attached to pages
- Comments: User discussions and collaborative input on pages
What you can do with indexed Confluence data:
- Semantic Search: Find information across your Confluence spaces using natural language queries
- Context-Aware Chat: Get AI-generated answers from your knowledge base with citations
- Cross-Space Discovery: Search across multiple Confluence spaces and projects
- Knowledge Extraction: Transform Confluence content into searchable organizational knowledge
- Content Analysis: Analyze patterns, gaps, and relationships in your documentation
Common use cases:
- Onboarding new team members by allowing them to ask questions about processes and documentation
- Finding specific procedures, policies, or best practices across your knowledge base
- Generating summaries or explanations from existing documentation
- Support ticket resolution using indexed knowledge articles and FAQ content
- Content audit and analysis to identify documentation gaps or outdated information
Prerequisites¶
Before indexing Confluence data, ensure you have:
- Confluence Credential: A Confluence API token or authentication credentials configured in ELITEA
- Vector Storage: PgVector selected in Settings → AI Configuration
- Embedding Model: Selected in AI Configuration (defaults available) → AI Configuration
- Confluence Toolkit: Configured with your Confluence instance details and credentials
Required Permissions¶
Your Confluence credential needs appropriate permissions based on what you want to index:
For Content Access: - Read access to Confluence spaces and pages - Permission to view the specific spaces you want to index
For Comprehensive Indexing: - Access to view attachments (if including attachments) - Permission to view comments (if including comments) - Access to both public and restricted content (based on your requirements)
Authentication Methods: - Basic Authentication: Username and API Key - Bearer Token: Confluence API token
Step-by-Step: Creating a Confluence Credential¶
- Generate Confluence API Token in your Atlassian account (Security → API Tokens)
- Create Credential in ELITEA: Navigate to Credentials → + Create → Confluence → enter details and save
Detailed Instructions
For complete credential setup steps including token generation and security best practices, see:
- Create a Credential
- Confluence Credential Setup
- Confluence Toolkit Guide (Token generation section)
Step-by-Step: Configure Confluence Toolkit¶
- Create Toolkit: Navigate to Toolkits → + Create → Confluence
- Configure Settings: Set base URL, space, hosting option, and assign your Confluence credential
- Enable Tools: Select
Index Data
,List Collections
,Search Index
,Stepback Search Index
,Stepback Summary Index
, andRemove Index
tools - Save Configuration
Tool Overview:¶
- Index Data: Creates searchable indexes from Confluence pages and content
- List Collections: Lists all available collections/indexes to verify what's been indexed
- Search Index: Performs semantic search across indexed content using natural language queries
- Stepback Search Index: Advanced search that breaks down complex questions into simpler parts for better results
- Stepback Summary Index: Generates summaries and insights from search results across indexed content
- Remove Index: Deletes existing collections/indexes when you need to clean up or start fresh
Detailed Instructions
For complete toolkit configuration including space setup and authentication options, see:
Step-by-Step: Index Confluence Data¶
Content Indexing (from Toolkit)¶
-
Open Toolkit Test Settings:
- Navigate to your Confluence toolkit's detail page
- In the Test Settings panel (right side), select a model (e.g.,
gpt-4o
)
-
Configure Index Data Tool:
- From the tool dropdown, select "Index Data"
- Configure the following parameters:
Parameter Description Example Value Collection Suffix * Suffix for collection name (required) kb
ordocs
Clean Index Remove existing index data before re-indexing ✓ (checked) or ✗ (unchecked) Chunking Tool Method for splitting content into chunks Default chunking or custom Include Restricted Content Index pages with restricted access ✓ (checked) or ✗ (unchecked) Include Archived Content Index archived pages and spaces ✓ (checked) or ✗ (unchecked) Include Attachments Index attachment content ✓ (checked) or ✗ (unchecked) Include Comments Index page comments ✓ (checked) or ✗ (unchecked) Include Labels Index page labels ✓ (checked) or ✗ (unchecked) Keep Markdown Format Preserve Markdown formatting in indexed content ✓ (checked) or ✗ (unchecked) Keep Newlines Preserve line breaks and formatting ✓ (checked) or ✗ (unchecked) Bins With Llm Use LLM for content binning/organization ✓ (checked) or ✗ (unchecked) -
Run Confluence Indexing:
- Click "Run Tool"
- Wait for completion (may take several minutes for large spaces)
- Check the output for success confirmation or error messages
¶
Verification: Confirm Index Success¶
After indexing completes, verify the index was created successfully:
Method 1: Using Test Settings (Technical Verification)¶
-
Use List Collections Tool:
- In Test Settings, select "List Collections" tool
- Run tool to see all available collections
- Look for your collection with the specified suffix
-
Test Basic Search:
- Select "Search Index" tool
- Simple Query: e.g.,
documentation setup process
- Collection Suffix: Your specified suffix
- Run tool and verify relevant results are returned
Search and Chat with Indexed Data¶
Once your Confluence data is indexed, you can use the toolkit to search and interact with your content in multiple ways:
Using Toolkit in Conversations and Agents¶
Your Confluence toolkit can be used in two main contexts:
- In Conversations: Add the toolkit as a participant to ask questions and search your indexed Confluence data
- In Agents: Include the toolkit when creating AI agents to give them access to your Confluence knowledge base
How to use:
- Start a New Conversation or Create an Agent
- Add Toolkit as Participant: Select your Confluence toolkit from the available toolkits
- Ask Natural Language Questions: The toolkit will automatically search your indexed data and provide relevant answers with citations
Real-Life Example Workflow¶
Let's walk through a complete example of indexing and using a company knowledge base:
Step 1: Setup Confluence Toolkit for Company Knowledge Base
Configure Confluence Toolkit:
- Base URL:
https://yourcompany.atlassian.net/wiki
- Space:
KB
(Knowledge Base space key) - Hosting Option:
Cloud
- Credential: Your Confluence API token
- Tools enabled:
Index Data
,List Collections
,Search Index
,Stepback Search Index
,Stepback Summary Index
,Remove Index
Step 2: Use in Conversations
Add toolkit to a conversation and start asking questions:
- Navigate to Conversations → + New Conversation
- In the participants section click + to add Toolkit
- Select your Confluence toolkit from the available toolkits list
Step 3: Index the Knowledge Base
- User Request in Chat:
"Index our knowledge base with collection suffix 'kb'. Include attachments and labels but exclude comments. Keep markdown formatting and clean any existing index first."
Step 4: Verify Index Creation
- User Request:
"List my collections"
Confluence Toolkit Response:
This confirms the indexing completed successfully and your knowledge base collection is available for searching.
Step 5: Start the conversation and begin asking questions about your indexed content
-
User Request: "What is our company's vacation policy?"
-
Confluence Toolkit: "Based on your indexed knowledge base, here's your company's vacation policy:
- Annual Leave Entitlement: All full-time employees are entitled to 25 days of annual leave per year
- Accrual System: Vacation days accrue monthly at a rate of 2.08 days per month
- Carryover Policy: Up to 5 unused vacation days can be carried over to the next year...
Source: HR Policies → Employee Benefits → Vacation Policy (Page ID: 12345)"
-
User Request: "How do I set up a new development environment?"
-
Confluence Toolkit: "Here's the step-by-step process for setting up a new development environment:
Prerequisites:
- Install Docker Desktop
- Obtain access credentials from the DevOps team
- Clone the main repository from GitHub
Setup Steps:
- Run the environment setup script:
./scripts/setup-dev.sh
- Configure your local environment variables...
Source: Developer Documentation → Environment Setup → Development Environment (Page ID: 67890)"
Troubleshooting & Tips¶
Common Errors and Solutions¶
"Space not found" or "Authentication failed":
- Verify your Confluence credential has the correct API token
- Ensure the space key is exact and case-sensitive (e.g.,
KB
, notkb
) - Check that your token has appropriate permissions for the space
"API rate limit exceeded":
- Large spaces may hit Confluence API limits
- Reduce Max Pages or use more specific filtering (labels, CQL)
- Wait and retry, or consider indexing in smaller batches
"No documents indexed":
- Check your label or CQL filters aren't too restrictive
- Verify the space contains pages matching your criteria
- Try indexing without filters first, then add restrictions
"Vector database connection failed" or "PgVector errors":
- Ensure PgVector is properly configured in Settings → AI Configuration
- Verify the vector database is running and accessible
- Check connection credentials and database permissions
- Restart the vector database service if connection issues persist
"Embedding model not found" or "Embedding errors":
- Verify an embedding model is selected in AI Configuration
- Check if the embedding model is properly downloaded/initialized
- Try switching to a different embedding model (e.g., text-embedding-ada-002)
- Ensure sufficient system resources for the embedding model to load
Performance and Scope Considerations¶
For Large Confluence Spaces:
- Use specific label filters:
label="documentation"
orlabel="public"
- Use CQL queries to target specific content:
space="KB" AND created>="2024-01-01"
- Consider indexing by space hierarchy: index parent pages first, then children
- Set reasonable Max Pages limits: start with 100-500 pages for testing
Search Result Quality¶
If search returns few/no results:
- Lower the cut-off score from 0.5 to 0.35 or 0.3
- Increase search_top from 10 to 20 or 30
- Try rephrasing your query with different keywords
- Verify the indexed content contains relevant information for your query
For better search quality:
- Include both pages and attachments for comprehensive coverage
- Use natural language queries rather than exact keyword matches
- Leverage stepback search for complex questions that require reasoning
- Create separate indexes for different content types (public vs internal, different spaces)
Content-Specific Indexing Tips¶
For Knowledge Base Content:
- Focus on pages with
public
ordocumentation
labels - Include attachments for comprehensive policy and procedure documents
- Consider excluding comments unless they contain valuable information
For Project Documentation:
- Use space-specific indexing for better organization
- Include both current and archived content for historical context
- Index meeting notes and project updates separately from technical documentation
For Troubleshooting Guides:
- Include pages with labels like
troubleshooting
,faq
, orsupport
- Index both internal and customer-facing content
- Consider including comments as they often contain additional solutions
References¶
Related Documentation
For additional information and detailed setup instructions, see:
- Indexing Overview - General indexing concepts and features
- Create a Credential - Step-by-step credential creation guide
- How to Use Credentials - Credential management and Confluence setup
- Toolkits Menu - Toolkit configuration and management
- Confluence Toolkit Integration Guide - Complete Confluence toolkit reference
- AI Configuration - Vector storage and embedding model setup
- Chat Menu - Creating conversations and adding toolkit participants