March 1, 2026
Data Hub
Data Hub is the foundation of the iDataWorkers platform — it connects, unifies, and manages all your data sources in a single, accessible layer. Whether your data lives in enterprise systems, databases, spreadsheets, or APIs, Data Hub brings it together so every other module can work with clean, consistent data.
Supported Data Sources
Data Hub connects to a wide range of sources out of the box:
- Enterprise systems — Oracle Cloud, Oracle Fusion, SAP, Salesforce, Microsoft Dynamics
- Databases — PostgreSQL, MySQL, MongoDB, SQL Server, Snowflake, BigQuery
- Files — Excel (.xlsx), CSV, JSON, Parquet
- APIs — Any REST or GraphQL endpoint with authentication
- Cloud storage — AWS S3, Google Cloud Storage, Azure Blob
Need a source that is not listed? Check the Integrations marketplace or contact our team for custom connectors.
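For REST and GraphQL sources, connectors typically attach credentials to each request. The platform's actual connector internals aren't shown here, so the sketch below is purely illustrative, using Python's standard library; the header scheme (Bearer token) and the example URL are assumptions, as the exact scheme varies by source:

```python
import urllib.request

def build_rest_request(url: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for a REST data source.

    Hypothetical sketch: many REST sources accept an Authorization
    header, but each source defines its own auth scheme.
    """
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )

# Usage (illustrative endpoint):
# req = build_rest_request("https://api.example.com/v1/orders", "my-key")
# data = urllib.request.urlopen(req).read()
```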
Adding a Connection
To connect a new data source:

1. Navigate to Data Hub from the sidebar
2. Click Add Connection in the top right
3. Select your source type from the catalog
4. Enter credentials and configure authentication (OAuth, API key, or direct credentials)
5. Select the schemas, tables, or datasets you want to import
6. Review the auto-detected field mappings and adjust if needed
7. Click Connect to start the initial sync
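Conceptually, the wizard above collects a small configuration object before the first sync runs. The sketch below models that shape in Python; the field names and validation rules are illustrative assumptions, not the platform's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ConnectionConfig:
    """Hypothetical model of the settings the Add Connection wizard collects."""
    name: str                # a meaningful display name for the connection
    source_type: str         # e.g. "postgresql", "salesforce" (illustrative)
    auth_method: str         # "oauth", "api_key", or "credentials"
    datasets: list = field(default_factory=list)  # schemas/tables to import

def validate(config: ConnectionConfig) -> list:
    """Return a list of problems; an empty list means ready to connect."""
    errors = []
    if config.auth_method not in ("oauth", "api_key", "credentials"):
        errors.append(f"unknown auth method: {config.auth_method}")
    if not config.datasets:
        errors.append("select at least one schema, table, or dataset")
    return errors
```

A config that skips step 5 (selecting datasets) would fail validation before the initial sync is attempted.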
Sync Options
Data Hub offers flexible sync strategies to keep your data fresh:
- Full sync — Re-imports the entire dataset on each run. Best for smaller datasets or when data changes unpredictably.
- Incremental sync — Only imports new or changed records since the last sync. Faster and more efficient for large datasets.
- Real-time streaming — Continuous sync via the Data Sync module for time-sensitive data.
Schedule syncs to run hourly, daily, or weekly — or trigger them manually or via API.
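The core of an incremental strategy is a watermark: remember the latest change timestamp seen, and on the next run import only records newer than it. A minimal sketch, assuming each record carries an `updated_at` field (an assumption; the actual cursor column depends on the source):

```python
from datetime import datetime

def incremental_sync(records: list, last_sync: datetime):
    """Return records changed since the previous sync, plus the new watermark.

    Hypothetical sketch: assumes every record has an 'updated_at'
    timestamp, which is how incremental strategies typically detect changes.
    """
    fresh = [r for r in records if r["updated_at"] > last_sync]
    # The newest timestamp seen becomes the watermark for the next run;
    # if nothing changed, the old watermark is kept.
    new_watermark = max((r["updated_at"] for r in fresh), default=last_sync)
    return fresh, new_watermark
```

A full sync, by contrast, would simply return every record on every run, which is why it suits smaller or unpredictably changing datasets.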
Data Quality & Monitoring
Data Hub includes built-in data quality features:
- Schema validation — Detects when source schemas change and alerts you
- Null & type checks — Flags unexpected nulls, type mismatches, or format issues
- Sync history — Full audit trail of every sync run with row counts, duration, and error details
- Alerts — Get notified when syncs fail or data quality drops below thresholds
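Null and type checks boil down to comparing each incoming row against an expected schema. The platform's actual validation engine isn't shown here; this is a minimal sketch where `schema` maps a field name to its expected Python type:

```python
def check_rows(rows: list, schema: dict) -> list:
    """Flag unexpected nulls and type mismatches against an expected schema.

    Hypothetical sketch: returns (row_index, field, problem) tuples,
    the kind of detail a sync-quality alert might surface.
    """
    issues = []
    for i, row in enumerate(rows):
        for field, expected in schema.items():
            value = row.get(field)
            if value is None:
                issues.append((i, field, "unexpected null"))
            elif not isinstance(value, expected):
                issues.append((i, field, f"expected {expected.__name__}"))
    return issues
```

An alerting rule could then fire when `len(issues) / len(rows)` exceeds a configured threshold.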
Best Practices
- Start with your most critical data sources — you can always add more later
- Use meaningful names for connections and unified fields to make them easy to find
- Set up incremental sync for large datasets to reduce load and sync time
- Enable schema change detection to catch breaking changes early
- Review sync history regularly to catch and resolve issues quickly
Next Steps
- Data Sync — Set up real-time pipelines and advanced ETL workflows
- Cognify — Analyze your unified data with AI-powered insights
- Copilot Dexi — Query your data using natural language