Wiki source code of Friday 14 February 2025
Last modified by dennis yoshikawa on 2025/02/14 01:23
| author | version | line-number | content |
|---|---|---|---|
| 1 | = Weekly Report 14 February 2025 = | ||
| 2 | |||
| 3 | |||
| 4 | == (% id="cke_bm_29973S" style="display:none" %) (%%)Data Engineer == | ||
| 5 | |||
| 6 | >What Have You Done This Week? | ||
| 7 | |||
| 8 | ((( | ||
| 9 | {{success}} | ||
| 10 | What Have You Done? | ||
| 11 | {{/success}} | ||
| 12 | |||
| 13 | * Airflow-Airbyte maintenance | ||
| 14 | |||
| 15 | ~1. Created a task to delete the GCS folder for the dharmadexa table | ||
| 16 | 2. Changed the incremental method in Airbyte for tb) | ||
| 17 | |||
| 18 | * DKonsul | ||
| 19 | |||
| 20 | ~1. Doctor incentive - Deploy transformation code to production. | ||
| 21 | 2. Doctor incentive - Final QC with partnership team. | ||
| 22 | 3. Dataset migration for dashboards (Herbawell Dashboard, BRI Medika DKONSUL) | ||
| 23 | |||
| 24 | * D2D | ||
| 25 | |||
| 26 | ~1. Enhanced the stickiness rate logic (mart) | ||
| 27 | 2. Weekly regroup with the Product team | ||
| 28 | |||
| 29 | * MCN | ||
| 30 | |||
| 31 | ~1. Troubleshot DAG and resource issues | ||
| 32 | 2. Repointed all existing pipelines to the new database | ||
| 33 | |||
| 34 | * Screening | ||
| 35 | |||
| 36 | ~1. Regrouped with the PM and DA to validate data and discuss funnel data | ||
| 37 | 2. Enhanced the logic of the dharmadexa fact table | ||
| 38 | 3. Prepared dharmadexa phase 3 data (funneling) | ||
| 39 | 4. Tested geo.py for looking up a city from longitude and latitude | ||
| 40 | 5. Migrated the dataset cleaned_screening_merck.dim_screening_stunting to CH | ||
| 41 | |||
| 42 | {{warning}} | ||
| 43 | What Issues Do You Have? | ||
| 44 | {{/warning}} | ||
| 45 | |||
| 46 | 1. Open source data quality tools are hard to find; most of the available tools are paid. | ||
| 47 | |||
| 48 | {{info}} | ||
| 49 | What Will You Do Next? (Optional) | ||
| 50 | {{/info}} | ||
| 51 | |||
| 52 | 1. Continue exploring open source data quality tools | ||
| 53 | 1. Tune queries to reduce BigQuery cost | ||
| 54 | |||
| 55 | **What Support Do You Need? (Optional)** | ||
| 56 | |||
| 57 | |||
| 58 | ---- | ||
| 59 | |||
| 60 | |||
| 61 | ))) | ||
| 62 | |||
| 63 | |||
| 64 | == (% id="cke_bm_29973S" style="display:none" %) (%%)Data Analyst == | ||
| 65 | |||
| 66 | >What Have You Done This Week? | ||
| 67 | |||
| 68 | ((( | ||
| 69 | {{success}} | ||
| 70 | What Have You Done? | ||
| 71 | {{/success}} | ||
| 72 | |||
| 73 | 1. **Dharma Dexa Phase 2** – Completed the second phase of Dharma Dexa. | ||
| 74 | 1. **Migration from Looker to Metabase** – Transitioning data visualization and analytics from Looker to Metabase. | ||
| 75 | 1. **Redesigned DKonsul Data** – Improved the structure and organization of DKonsul data. | ||
| 76 | 1. **Ad-hoc Requests** – Handled various on-demand data requests. | ||
| 77 | |||
| 78 | {{warning}} | ||
| 79 | What Issues Do You Have? | ||
| 80 | {{/warning}} | ||
| 81 | |||
| 82 | 1. **Metabase Limitations** – Limited chart options and flexibility in customization, particularly a lack of aggregation functions. | ||
| 83 | |||
| 84 | {{info}} | ||
| 85 | What Will You Do Next? (Optional) | ||
| 86 | {{/info}} | ||
| 87 | |||
| 88 | 1. **Continue Redesigning DKonsul Data** | ||
| 89 | 1*. Refining the data funnel from consultation → prescription → transaction. | ||
| 90 | 1. **Continue Migration to Metabase** – Ensuring a smooth transition from Looker to Metabase. | ||
| 91 | 1. **Dharma Dexa Phase 3** – Proceeding with the next phase of the Dharma Dexa project. | ||
| 92 | 1. **AppSheet MCN Visit Tracker Dashboard** – Developing and optimizing the dashboard. | ||
| 93 | |||
| 94 | **What Support Do You Need? (Optional)** | ||
| 95 | |||
| 96 | 1. **Data Validation** – Ensuring data accuracy and consistency. | ||
| 97 | 1. **Dharma Dexa Screening Enhancements** | ||
| 98 | 1*. Assigning a **new screening ID** for each event, especially if different questions and inputs are involved. | ||
| 99 | 1*. Adding **location input (province, city)** for better analysis. | ||
| 100 | ))) | ||
| 101 | |||
| 102 | |||
| 103 | ---- | ||
| 104 | |||
| 105 | == Data Analyst & AI == | ||
| 106 | |||
| 107 | >What Have You Done This Week? | ||
| 108 | |||
| 109 | ((( | ||
| 110 | {{success}} | ||
| 111 | What Have You Done? | ||
| 112 | {{/success}} | ||
| 113 | |||
| 114 | **AUTOMARK** | ||
| 115 | |||
| 116 | 1. Deployed, evaluated, and documented the **Screening Dharma Dexa API** for use by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users_dd>>https://datalake.ptgue.com/v1/users_dd]] and the **documentation has already been sent to the Tech team**. **The API can be queried with a prompt** to retrieve the users relevant to that prompt. As of 13 Feb 2025, **the RAG accuracy is 87.50%**. | ||
| 117 | 1. Deployed, evaluated, and documented the **master user GUE Ecosystem** API for use by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users>>https://datalake.ptgue.com/v1/users_dd]] and the **documentation has already been sent to the Tech team**. **The API has already been tested by the Tech team with no issues found.** | ||
| 118 | |||
| 119 | **MCN** | ||
| 120 | |||
| 121 | 1. Fixed the resource spike caused by the scraping schedule. The fix has passed testing for 3 days (Wednesday, Thursday, and Friday). **Scraping time decreased from 1 minute per user to a maximum of about 20 seconds per user.** **(Paired with Syifa-DE)** | ||
| 122 | 1. **Repointed, realigned, and redesigned the pipeline and database** used for scraping. **(Paired with Syifa-DE)** | ||
| 123 | |||
| 124 | **Automation** | ||
| 125 | |||
| 126 | 1. **Implemented compliance mapping** with the newest data from the compliance team | ||
| 127 | 1. Validated the data, flow, and script for the doctor performance user report. **The sample report has already been validated by dr. Astrid.** | ||
| 128 | |||
| 129 | **Others** | ||
| 130 | |||
| 131 | 1. Request and enhancement for the Merck dashboard | ||
| 132 | 1. Data request: DKonsul data for online doctors, transactions, prescriptions, Dexa prescriptions, and a comparison of new users in January | ||
| 133 | 1. Data request: ICD-10 data | ||
| 134 | |||
| 135 | {{warning}} | ||
| 136 | What Issues Do You Have? | ||
| 137 | {{/warning}} | ||
| 138 | |||
| 139 | 1. The Tech team provided no information about the MCN database repointing, so the data was not updated. This has been resolved by coordinating with the Product team. | ||
| 140 | 1. RAG accuracy needs to improve to around 90% | ||
| 141 | |||
| 142 | {{info}} | ||
| 143 | What Will You Do Next? (Optional) | ||
| 144 | {{/info}} | ||
| 145 | |||
| 146 | 1. Increase RAG accuracy by adding more training queries for the LLM | ||
| 147 | 1. Start daily recurring scraping for the TikTok dashboard | ||
| 148 | 1. Start DKonsul insights next week | ||
| 149 | |||
| 150 | **What Support Do You Need? (Optional)** | ||
| 151 | |||
| 152 | |||
| 153 | === Summary === | ||
| 154 | |||
| 155 | Below is a summary of the weekly report for 14 February 2025, written by Haekal Yusril Faizin. It summarizes the activities and issues of the Data Engineer, Data Analyst, and Data Analyst & AI teams. | ||
| 156 | |||
| 157 | **Data Analyst** | ||
| 158 | |||
| 159 | * Completed the second phase of Dharma Dexa. | ||
| 160 | * Migrating data visualization and analytics from Looker to Metabase. | ||
| 161 | * Redesigned DKonsul data for better structure and organization. | ||
| 162 | * Handled various on-demand data requests. | ||
| 163 | |||
| 164 | **Data Analyst & AI** | ||
| 165 | |||
| 166 | * Deployed, evaluated, and documented the Screening Dharma Dexa API for use by the Tech team. | ||
| 167 | * Deployed, evaluated, and documented the master user GUE Ecosystem for use by the Tech team. | ||
| 168 | * Fixed the resource spike caused by the scraping schedule. | ||
| 169 | * Repointed, realigned, and redesigned the pipeline and database used for scraping. | ||
| 170 | * Implemented compliance mapping with the latest data from the compliance team. | ||
| 171 | * Validated the data, flow, and script for the doctor performance user report. | ||
| 172 | |||
| 173 | **Issues** | ||
| 174 | |||
| 175 | * Limited chart options and customization flexibility in Metabase. | ||
| 176 | * Lack of information from the Tech team about the MCN database repointing. | ||
| 177 | * RAG accuracy needs to improve to around 90%. | ||
| 178 | |||
| 179 | **Next Steps** | ||
| 180 | |||
| 181 | * Continue the DKonsul data redesign and the migration to Metabase. | ||
| 182 | * Proceed with Dharma Dexa Phase 3 and develop the AppSheet MCN Visit Tracker Dashboard. | ||
| 183 | * Increase RAG accuracy, start daily recurring scraping for the TikTok dashboard, and start DKonsul insights next week. | ||
| 184 | |||
| 185 | **Support Needed** | ||
| 186 | |||
| 187 | * Data validation and Dharma Dexa screening enhancements. | ||
| 188 | ))) |