Sync Data Sources
Learn how to synchronize your data sources and keep them up to date in Noema AI.
Overview
Data synchronization ensures your AI assistants and workflows always have access to the latest information from your connected sources.
Sync Methods
1. Automatic Sync
Keep a source updated in real time or on a schedule:
- Real-time: Instant updates (for supported sources)
- Hourly: Sync every hour
- Daily: Sync once per day at a specific time
- Weekly: Sync once per week
- Custom: Define your own schedule
2. Manual Sync
Trigger sync on-demand:
- Navigate to Sources
- Select the data source
- Click "Sync Now"
- Monitor progress
3. Event-Based Sync
Sync when events occur:
- File upload detected
- Database change notification
- Webhook trigger
- API callback
Configuring Sync Settings
Access Sync Configuration
- Select your data source
- Click "Sync Settings"
- Configure options
Sync Frequency
Choose how often to sync:
Example schedule:
- Frequency: Every 6 hours
- Start Time: 00:00
- Days: Monday - Friday
- Timezone: UTC, or your own timezone
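To see how a schedule like the one above turns into concrete run times, here is a minimal Python sketch. It is not Noema AI's scheduler: the function name `next_sync_time` and its parameters are hypothetical, and it assumes all times are in UTC with slots counted from the start time within each allowed day.

```python
from datetime import datetime, timedelta, timezone

def next_sync_time(now, interval_hours=6, start_hour=0, weekdays=(0, 1, 2, 3, 4)):
    """Return the first scheduled slot strictly after `now`.

    Assumption: slots fall at start_hour, start_hour + interval_hours, ...
    within each allowed day (Monday is weekday 0, Friday is 4).
    """
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    for day_offset in range(8):  # today plus the following seven days
        day = day_start + timedelta(days=day_offset)
        if day.weekday() not in weekdays:
            continue  # skip days excluded by the schedule
        for hour in range(start_hour, 24, interval_hours):
            slot = day + timedelta(hours=hour)
            if slot > now:
                return slot
    raise ValueError("schedule contains no allowed weekday")
```

With the default arguments, a call made on a Friday evening after the last slot of the day returns the following Monday at 00:00, because Saturday and Sunday are skipped.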
Incremental vs. Full Sync
Incremental Sync (Recommended):
- Only syncs new or changed files
- Faster and more efficient
- Lower resource usage
Full Sync:
- Processes all files
- Use for data validation
- Slower but more thorough
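The difference between the two modes can be sketched in a few lines of Python. This is an illustration only: it assumes change detection by comparing modification timestamps, which may not be how Noema AI detects changes, and the function names are hypothetical.

```python
def plan_incremental_sync(source_files, last_synced):
    """Return paths that are new or changed since the previous sync.

    source_files: {path: modification timestamp currently at the source}
    last_synced:  {path: modification timestamp recorded at the last sync}
    """
    return sorted(
        path
        for path, modified in source_files.items()
        if last_synced.get(path) != modified  # missing key means a new file
    )

def plan_full_sync(source_files):
    """Return every path, regardless of what was synced before."""
    return sorted(source_files)
```

An incremental plan grows with the number of changes, while a full plan always grows with the size of the source, which is why incremental sync is the faster choice for routine updates.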
File Filters
Control what gets synced:
- Include Patterns: *.pdf, *.docx, *.xlsx
- Exclude Patterns: ~*.*, temp_*
- Size Limits: Max file size to process
- Date Filters: Only files modified after date
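To check how a set of filters would treat a given file before you run a sync, you can reproduce the logic locally. The sketch below uses Python's standard `fnmatch` module and assumes shell-style wildcard matching on the file name; the `FILTERS` dictionary, the 50 MB limit, the cutoff date, and the `should_sync` function are all hypothetical, not Noema AI settings.

```python
from datetime import date
from fnmatch import fnmatchcase

# Hypothetical filter settings mirroring the options listed above.
FILTERS = {
    "include": ["*.pdf", "*.docx", "*.xlsx"],
    "exclude": ["~*.*", "temp_*"],
    "max_size_bytes": 50 * 1024 * 1024,  # assumed 50 MB limit
    "modified_after": date(2025, 1, 1),  # assumed cutoff date
}

def should_sync(name: str, size_bytes: int, modified: date, filters: dict = FILTERS) -> bool:
    """Return True if the file passes every filter."""
    if not any(fnmatchcase(name, pattern) for pattern in filters["include"]):
        return False  # not an included file type
    if any(fnmatchcase(name, pattern) for pattern in filters["exclude"]):
        return False  # temporary or lock file
    if size_bytes > filters["max_size_bytes"]:
        return False  # too large to process
    return modified > filters["modified_after"]
```

In this sketch exclude patterns take precedence over include patterns, so a file such as `~report.docx` is skipped even though it matches `*.docx`.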
Monitoring Sync Activity
Sync Dashboard
View real-time sync status:
- Current sync progress
- Files being processed
- Estimated completion time
- Errors or warnings
Sync History
Review past synchronizations:
- Date/Time: When sync occurred
- Duration: How long it took
- Files Processed: Number of files
- Status: Success, partial, or failed
- Errors: Any issues encountered
Sync Logs
Access detailed logs:
[2025-11-05 10:30:15] Sync started
[2025-11-05 10:30:16] Connected to SharePoint
[2025-11-05 10:30:17] Found 45 new files
[2025-11-05 10:30:18] Processing: Document1.pdf
[2025-11-05 10:30:20] Extracted 1,234 words
[2025-11-05 10:30:21] Indexed successfully
...
[2025-11-05 10:35:42] Sync completed: 45/45 files
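If you export logs for your own reporting, lines in this shape are straightforward to parse. The following Python sketch assumes every entry follows the `[YYYY-MM-DD HH:MM:SS] message` layout shown in the sample; the actual log format may contain additional fields, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches "[2025-11-05 10:30:15] Sync started" and captures both parts.
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$")

def parse_sync_log(text):
    """Return (timestamp, message) pairs, skipping lines that do not match."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, match.group(2)))
    return entries

def sync_duration_seconds(entries):
    """Seconds elapsed between the first and last parsed entries."""
    return (entries[-1][0] - entries[0][0]).total_seconds()
```

Applied to the sample above, the first and last entries are 5 minutes 27 seconds apart, so `sync_duration_seconds` returns 327.0.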
Handling Sync Conflicts
Conflict Detection
When files are modified in multiple places:
- Noema AI detects conflicts
- Presents resolution options
- Logs conflict details
Resolution Strategies
Choose how to handle conflicts:
- Source Wins: Always use source data
- Latest Wins: Use most recently modified
- Manual Review: Review and decide
- Keep Both: Maintain versions
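The four strategies can be compared side by side in a short Python sketch. It is a simplified model, not Noema AI's implementation: the `Version` class, the strategy identifiers, and the choice to return an empty list for manual review are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Version:
    origin: str          # "source" or "indexed" (hypothetical labels)
    modified: datetime   # last modification time
    content: str

def resolve_conflict(source, indexed, strategy):
    """Return the list of versions to keep under the given strategy."""
    if strategy == "source_wins":
        return [source]
    if strategy == "latest_wins":
        # On a tie, max() returns the first argument, so the source is kept.
        return [max((source, indexed), key=lambda v: v.modified)]
    if strategy == "keep_both":
        return [source, indexed]
    if strategy == "manual_review":
        return []  # nothing is applied until a person decides
    raise ValueError(f"unknown strategy: {strategy}")
```

Note that Source Wins and Latest Wins give different results whenever the copy outside the source was edited more recently, so choose between them based on which side you treat as authoritative.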
Performance Optimization
Optimize Sync Speed
Improve performance:
- Filter Unnecessary Files: Sync only what you need
- Schedule During Off-Hours: Reduce network contention
- Use Incremental Sync: Faster than full sync
- Batch Processing: Process files in groups
Resource Management
Control resource usage:
- Concurrent Files: How many files to process simultaneously
- Bandwidth Limits: Throttle download speed
- Processing Queue: Prioritize certain file types
- Retry Logic: How to handle temporary failures
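To make the Concurrent Files and Retry Logic settings concrete, here is a minimal Python sketch that processes files with a bounded worker pool and retries temporary failures with exponential backoff. The names `TemporaryError`, `with_retries`, and `process_all`, along with the default values, are hypothetical; exponential backoff is one common retry policy, not necessarily the one Noema AI applies.

```python
import time
from concurrent.futures import ThreadPoolExecutor

class TemporaryError(Exception):
    """Hypothetical marker for failures worth retrying, such as a timeout."""

def with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run task(), retrying on TemporaryError with doubling delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TemporaryError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, 1 s, 2 s, ...

def process_all(files, process_one, concurrent_files=4):
    """Process files with at most `concurrent_files` running at once."""
    with ThreadPoolExecutor(max_workers=concurrent_files) as pool:
        # pool.map preserves input order in the returned results.
        return list(pool.map(lambda f: with_retries(lambda: process_one(f)), files))
```

Lowering `concurrent_files` reduces load on the source and the network at the cost of a longer sync, which is the same trade-off the Concurrent Files setting controls.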
Advanced Sync Options
Metadata Extraction
Configure what metadata to extract:
- File properties (author, date, size)
- Document content (headings, keywords)
- Custom fields
- Tags and categories
Indexing Settings
Control how data is indexed:
- Full-Text Search: Enable/disable
- Language Detection: Auto-detect or specify
- Chunking: How to split large documents
- Embeddings: Generate AI embeddings for semantic search
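Chunking is easier to reason about with an example. The sketch below splits text into fixed-size word windows with a small overlap, which is one common approach; it is an assumption that Noema AI chunks by words at all, and the function name and default sizes are hypothetical.

```python
def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Smaller chunks give more precise matches in semantic search, while larger chunks preserve more context per match, so the right size depends on the documents you index.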
Webhooks and Notifications
Get notified about sync events:
- Sync completion
- Errors or failures
- Threshold alerts (e.g., >100 files)
- Custom event triggers
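As a rough illustration of how these notification rules combine, the Python sketch below decides which events a finished sync would raise. The event names, the function `sync_notifications`, and its parameters are hypothetical and do not reflect Noema AI's webhook payloads.

```python
def sync_notifications(files_processed, files_failed, file_threshold=100):
    """Return the (hypothetical) event names raised by one finished sync."""
    events = ["sync.completed"]  # always sent when a sync finishes
    if files_failed > 0:
        events.append("sync.error")
    if files_processed > file_threshold:
        events.append("sync.threshold_exceeded")  # e.g., more than 100 files
    return events
```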
Troubleshooting Sync Issues
Sync Stuck or Slow
Solutions:
- Check network connectivity
- Verify source availability
- Review file filters
- Reduce concurrent processing
- Check resource quotas
Files Not Syncing
Check that:
- You have permission to read the file
- The file type is supported
- The file is not locked or in use
- The file is within the size limits
- The file name matches the include patterns
Sync Errors
Common errors and fixes:
Authentication Error:
- Re-authenticate data source
- Check credentials haven't expired
- Verify permissions
Network Timeout:
- Check network stability
- Increase timeout settings
- Retry with smaller batches
Storage Full:
- Clean up old data
- Increase storage quota
- Archive processed files
Best Practices
- Use incremental sync for regular updates
- Schedule syncs during low-traffic periods
- Monitor sync performance regularly
- Set up alerts for failures
- Test sync configuration with small datasets first
- Document sync schedules and settings
- Review and clean up old data periodically