# Day 16: BigQuery Setup Guide

## Overview
This guide helps you upload the Day 6 SaaS metrics data to BigQuery for use with Metabase Cloud.

---

## Step 1: Create BigQuery Dataset

### Using GCP Console:
1. Go to [BigQuery Console](https://console.cloud.google.com/bigquery)
2. Select your project (or create a new one)
3. Click "Create Dataset"
4. Use these settings:
   - **Dataset ID**: `day16_saas_metrics`
   - **Location**: `US` (or your preferred region)
   - **Default table expiration**: Never
5. Click "Create Dataset"

### Using gcloud CLI:
```bash
# Set your project ID
export DAY16_GCP_PROJECT_ID="your-project-id"

# Create dataset
bq mk \
  --dataset \
  --location=US \
  --description="Day 16: SaaS Health Metrics for Metabase Dashboard" \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics
```

---

## Step 2: Upload CSV Files to BigQuery

You have 8 CSV files in `day16/data/` that need to be uploaded.

### Option A: Using GCP Console (Easiest)

For each CSV file, follow these steps:

1. In BigQuery, select your `day16_saas_metrics` dataset
2. Click "Create Table"
3. Configure:
   - **Source**: Upload
   - **Select file**: Choose the CSV file
   - **File format**: CSV
   - **Table name**: Use the filename without `.csv` (e.g., `day06_dashboard_kpis`)
   - **Schema**: Auto-detect
   - **Advanced options** → Header rows to skip: `1`
4. Click "Create Table"

Repeat for all 8 files:
- `day06_dashboard_kpis.csv`
- `day06_mrr_summary.csv`
- `day06_retention_curves.csv`
- `day06_churn_by_cohort.csv`
- `day06_customer_health.csv`
- `day06_customers.csv`
- `day06_subscriptions.csv`
- `day06_mrr_movements.csv`
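Before uploading, it can help to confirm that all eight exports are actually present. A minimal sketch, assuming the files live in `day16/data/` as above:

```bash
# Report any missing Day 6 export before starting the uploads
cd day16/data
for t in dashboard_kpis mrr_summary retention_curves churn_by_cohort \
         customer_health customers subscriptions mrr_movements; do
  [ -f "day06_${t}.csv" ] || echo "missing: day06_${t}.csv"
done
```

If the loop prints nothing, every file is in place.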

### Option B: Using bq CLI (Faster for bulk upload)

```bash
# Navigate to the data directory
cd day16/data

# Set your project ID
export DAY16_GCP_PROJECT_ID="your-project-id"

# Upload all tables
bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_dashboard_kpis \
  day06_dashboard_kpis.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_mrr_summary \
  day06_mrr_summary.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_retention_curves \
  day06_retention_curves.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_churn_by_cohort \
  day06_churn_by_cohort.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_customer_health \
  day06_customer_health.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_customers \
  day06_customers.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_subscriptions \
  day06_subscriptions.csv

bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
  ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.day06_mrr_movements \
  day06_mrr_movements.csv
```
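Because every table name is just the filename minus `.csv`, the eight loads can also be written as a single loop. A sketch assuming you are in `day16/data` with `DAY16_GCP_PROJECT_ID` already exported:

```bash
# Load every Day 6 CSV into a table named after the file
for f in day06_*.csv; do
  bq load --source_format=CSV --autodetect --skip_leading_rows=1 \
    "${DAY16_GCP_PROJECT_ID}:day16_saas_metrics.${f%.csv}" \
    "$f"
done
```

The `${f%.csv}` expansion strips the extension, so `day06_customers.csv` loads into `day06_customers`.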

---

## Step 3: Verify Upload

```bash
# List tables
bq ls ${DAY16_GCP_PROJECT_ID}:day16_saas_metrics

# Check row counts
bq query --use_legacy_sql=false \
  "SELECT 'dashboard_kpis' as table_name, COUNT(*) as row_count FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_dashboard_kpis\`
   UNION ALL
   SELECT 'mrr_summary', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_mrr_summary\`
   UNION ALL
   SELECT 'retention_curves', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_retention_curves\`
   UNION ALL
   SELECT 'churn_by_cohort', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_churn_by_cohort\`
   UNION ALL
   SELECT 'customer_health', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_customer_health\`
   UNION ALL
   SELECT 'customers', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_customers\`
   UNION ALL
   SELECT 'subscriptions', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_subscriptions\`
   UNION ALL
   SELECT 'mrr_movements', COUNT(*) FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.day06_mrr_movements\`"
```

Expected row counts:
- `dashboard_kpis`: 1
- `mrr_summary`: 24
- `retention_curves`: 299
- `churn_by_cohort`: 52
- `customer_health`: 500
- `customers`: 500
- `subscriptions`: 641
- `mrr_movements`: 24
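As a lighter alternative to the UNION ALL query, BigQuery exposes per-table row counts in the dataset's `__TABLES__` metadata view, so the same check can be a single query that scans no table data:

```bash
# Row counts for every table in the dataset, straight from metadata
bq query --use_legacy_sql=false \
  "SELECT table_id, row_count
   FROM \`${DAY16_GCP_PROJECT_ID}.day16_saas_metrics.__TABLES__\`
   ORDER BY table_id"
```

Compare the output against the expected counts above.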

---

## Step 4: Create Service Account for Metabase

Metabase Cloud needs credentials to access your BigQuery data.

### Create Service Account:
```bash
# Set variables
export DAY16_GCP_PROJECT_ID="your-project-id"
export DAY16_SERVICE_ACCOUNT="metabase-day16"

# Create service account
gcloud iam service-accounts create ${DAY16_SERVICE_ACCOUNT} \
  --display-name="Metabase Day 16 Dashboard" \
  --project=${DAY16_GCP_PROJECT_ID}

# Grant BigQuery Data Viewer role
gcloud projects add-iam-policy-binding ${DAY16_GCP_PROJECT_ID} \
  --member="serviceAccount:${DAY16_SERVICE_ACCOUNT}@${DAY16_GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

# Grant BigQuery Job User role (for running queries)
gcloud projects add-iam-policy-binding ${DAY16_GCP_PROJECT_ID} \
  --member="serviceAccount:${DAY16_SERVICE_ACCOUNT}@${DAY16_GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# Create and download key
gcloud iam service-accounts keys create day16_metabase_key.json \
  --iam-account=${DAY16_SERVICE_ACCOUNT}@${DAY16_GCP_PROJECT_ID}.iam.gserviceaccount.com
```

**Save the `day16_metabase_key.json` file**: you'll need it when connecting Metabase Cloud in Step 5.
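To confirm that both role bindings took effect, you can list the roles granted to the service account. A sketch using `gcloud`'s `--flatten`/`--filter` output options, assuming the same variables as above are still exported:

```bash
# Show every role bound to the Metabase service account
gcloud projects get-iam-policy ${DAY16_GCP_PROJECT_ID} \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:${DAY16_SERVICE_ACCOUNT}@${DAY16_GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
```

You should see both `roles/bigquery.dataViewer` and `roles/bigquery.jobUser` in the output.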

---

## Step 5: Connect Metabase Cloud to BigQuery

1. Go to [Metabase Cloud](https://www.metabase.com/start/)
2. Sign up or log in
3. Click "Add a database"
4. Select "BigQuery"
5. Configure:
   - **Display Name**: Day 16 - SaaS Health Metrics
   - **Project ID**: Your GCP project ID
   - **Dataset ID**: `day16_saas_metrics`
   - **Service Account JSON**: Upload `day16_metabase_key.json`
6. Click "Save"
7. Click "Sync database schema now"

---

## Next Steps

Once connected, you're ready to create dashboard cards using the SQL queries in:
- `day16_QUERIES_metabase.md`

---

## Troubleshooting

### "Permission denied" error:
- Verify the service account has both `bigquery.dataViewer` and `bigquery.jobUser` roles
- Check that the service account JSON key is valid

### "Dataset not found":
- Ensure the dataset ID is exactly `day16_saas_metrics`
- Verify the dataset is in the same project as your service account

### "Table not found":
- Run the verification query in Step 3 to confirm all tables uploaded successfully
- Check that table names match exactly (case-sensitive)

---

## Cost Considerations

- **Storage**: ~1 MB total (negligible cost)
- **Queries**: Metabase preview queries typically scan <10 MB
- **Expected monthly cost**: <$1 USD (likely within the free tier)

---

## Security Notes

- ⚠️ **DO NOT commit `day16_metabase_key.json` to git**
- Add an ignore rule for JSON key files in the `day16` folder (e.g. `day16/*.json` in `.gitignore`)
- The service account has read-only data access (`dataViewer` role only)
- Consider setting up BigQuery authorized views for production
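One way to enforce the ignore rule above, assuming the key lives under `day16/` and you run this from the repository root:

```bash
# Append an ignore rule so the service-account key can't be committed
printf '%s\n' 'day16/*.json' >> .gitignore
```

Verify with `git status`: the key file should no longer appear as untracked.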

---

Built for Christmas Data Advent 2025 - Day 16 (Project 4A)