March 1, 2026
Data Hub
Data Hub is the foundation of the iDataWorkers platform — it connects, unifies, and manages all your data sources in a single, accessible layer. Whether your data lives in enterprise systems, databases, spreadsheets, or APIs, Data Hub brings it together so every other module can work with clean, consistent data.
Supported Data Sources
Data Hub connects to a wide range of sources out of the box:
- Enterprise systems — Oracle Cloud, Oracle Fusion, SAP, Salesforce, Microsoft Dynamics
- Databases — PostgreSQL, MySQL, MongoDB, SQL Server, Snowflake, BigQuery
- Files — Excel (.xlsx), CSV, JSON, Parquet
- APIs — Any REST or GraphQL endpoint with authentication
- Cloud storage — AWS S3, Google Cloud Storage, Azure Blob
Need a source that is not listed? Check the Integrations marketplace or contact our team for custom connectors.
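For REST and GraphQL sources, connectors typically attach credentials to each request. The platform's actual connector internals aren't shown here, so the sketch below is purely illustrative, using Python's standard library; the header scheme (Bearer token) and the example URL are assumptions, as the exact scheme varies by source:

```python
import urllib.request

def build_rest_request(url: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for a REST data source.

    Hypothetical sketch: many REST sources accept an Authorization
    header, but each source defines its own auth scheme.
    """
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )

# Usage (illustrative endpoint):
# req = build_rest_request("https://api.example.com/v1/orders", "my-key")
# data = urllib.request.urlopen(req).read()
```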
Adding a Connection
To connect a new data source:

1. Navigate to Data Hub from the sidebar
2. Click Add Connection in the top right
3. Select your source type from the catalog
4. Enter credentials and configure authentication (OAuth, API key, or direct credentials)
5. Select the schemas, tables, or datasets you want to import
6. Review the auto-detected field mappings and adjust if needed
7. Click Connect to start the initial sync
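Conceptually, the wizard above collects a small configuration object before the first sync runs. The sketch below models that shape in Python; the field names and validation rules are illustrative assumptions, not the platform's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ConnectionConfig:
    """Hypothetical model of the settings the Add Connection wizard collects."""
    name: str                # a meaningful display name for the connection
    source_type: str         # e.g. "postgresql", "salesforce" (illustrative)
    auth_method: str         # "oauth", "api_key", or "credentials"
    datasets: list = field(default_factory=list)  # schemas/tables to import

def validate(config: ConnectionConfig) -> list:
    """Return a list of problems; an empty list means ready to connect."""
    errors = []
    if config.auth_method not in ("oauth", "api_key", "credentials"):
        errors.append(f"unknown auth method: {config.auth_method}")
    if not config.datasets:
        errors.append("select at least one schema, table, or dataset")
    return errors
```

A config that skips step 5 (selecting datasets) would fail validation before the initial sync is attempted.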
Sync Options
Data Hub offers flexible sync strategies to keep your data fresh:
- Full sync — Re-imports the entire dataset on each run. Best for smaller datasets or when data changes unpredictably.
- Incremental sync — Only imports new or changed records since the last sync. Faster and more efficient for large datasets.
- Real-time streaming — Continuous sync via the Data Sync module for time-sensitive data.
Schedule syncs to run hourly, daily, or weekly — or trigger them manually or via API.
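The core of an incremental strategy is a watermark: remember the latest change timestamp seen, and on the next run import only records newer than it. A minimal sketch, assuming each record carries an `updated_at` field (an assumption; the actual cursor column depends on the source):

```python
from datetime import datetime

def incremental_sync(records: list, last_sync: datetime):
    """Return records changed since the previous sync, plus the new watermark.

    Hypothetical sketch: assumes every record has an 'updated_at'
    timestamp, which is how incremental strategies typically detect changes.
    """
    fresh = [r for r in records if r["updated_at"] > last_sync]
    # The newest timestamp seen becomes the watermark for the next run;
    # if nothing changed, the old watermark is kept.
    new_watermark = max((r["updated_at"] for r in fresh), default=last_sync)
    return fresh, new_watermark
```

A full sync, by contrast, would simply return every record on every run, which is why it suits smaller or unpredictably changing datasets.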
Data Quality & Monitoring
Data Hub includes built-in data quality features:
- Schema validation — Detects when source schemas change and alerts you
- Null & type checks — Flags unexpected nulls, type mismatches, or format issues
- Sync history — Full audit trail of every sync run with row counts, duration, and error details
- Alerts — Get notified when syncs fail or data quality drops below thresholds
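Null and type checks boil down to comparing each incoming row against an expected schema. The platform's actual validation engine isn't shown here; this is a minimal sketch where `schema` maps a field name to its expected Python type:

```python
def check_rows(rows: list, schema: dict) -> list:
    """Flag unexpected nulls and type mismatches against an expected schema.

    Hypothetical sketch: returns (row_index, field, problem) tuples,
    the kind of detail a sync-quality alert might surface.
    """
    issues = []
    for i, row in enumerate(rows):
        for field, expected in schema.items():
            value = row.get(field)
            if value is None:
                issues.append((i, field, "unexpected null"))
            elif not isinstance(value, expected):
                issues.append((i, field, f"expected {expected.__name__}"))
    return issues
```

An alerting rule could then fire when `len(issues) / len(rows)` exceeds a configured threshold.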
Best Practices
- Start with your most critical data sources — you can always add more later
- Use meaningful names for connections and unified fields to make them easy to find
- Set up incremental sync for large datasets to reduce load and sync time
- Enable schema change detection to catch breaking changes early
- Review sync history regularly to catch and resolve issues quickly
Next Steps
- Data Sync — Set up real-time pipelines and advanced ETL workflows
- Cognify — Analyze your unified data with AI-powered insights
- Copilot Dexi — Query your data using natural language