Wiki source code of Friday 14 February 2025

Last modified by dennis yoshikawa on 2025/02/14 01:23

= Weekly Report 14 February 2025 =


== (% id="cke_bm_29973S" style="display:none" %) (%%)Data Engineer ==

>What Have You Done This Week?

(((
{{success}}
What Have You Done?
{{/success}}

* Airflow-Airbyte maintenance
*1. Created a task to delete the GCS folder for the dharmadexa table.
*1. Changed the incremental method in Airbyte for tb.

* DKonsul
*1. Doctor incentive: deployed the transformation code to production.
*1. Doctor incentive: final QC with the partnership team.
*1. Migrated datasets for dashboards (Herbawell Dashboard, BRI Medika DKONSUL).

* D2D
*1. Enhanced the stickiness rate logic (mart).
*1. Weekly regroup with Product.

* MCN
*1. Troubleshot DAG and resource issues.
*1. Repointed all existing pipelines to the new database.

* Screening
*1. Regrouped with PM and DA to validate data and discuss funnel data.
*1. Enhanced the logic of the dharmadexa fact table.
*1. Prepared dharmadexa data for phase 3 (funneling).
*1. Tested geo.py to get the city from longitude and latitude.
*1. Migrated dataset cleaned_screening_merck.dim_screening_stunting to CH.

{{warning}}
What Issues Do You Have?
{{/warning}}

1. Finding open source data quality tools is difficult because most of the available tools are paid.

{{info}}
What Will You Do Next? (Optional)
{{/info}}

1. Continue exploring open source data quality tools.
1. Tune queries to reduce BigQuery cost.

**What Support Do You Need? (Optional)**


----


)))


== (% id="cke_bm_29973S" style="display:none" %) (%%)Data Analyst ==

>What Have You Done This Week?

(((
{{success}}
What Have You Done?
{{/success}}

1. **Dharma Dexa Phase 2** – Completed the second phase of Dharma Dexa.
1. **Migration from Looker to Metabase** – Transitioning data visualization and analytics from Looker to Metabase.
1. **Redesigned DKonsul Data** – Improved the structure and organization of DKonsul data.
1. **Ad-hoc Requests** – Handled various on-demand data requests.

{{warning}}
What Issues Do You Have?
{{/warning}}

1. **Metabase Limitations** – Limited chart options and flexibility in customization, particularly a lack of aggregation functions.

{{info}}
What Will You Do Next? (Optional)
{{/info}}

1. **Continue Redesigning DKonsul Data**
1*. Refining the data funnel from consultation → prescription → transaction.
1. **Continue Migration to Metabase** – Ensuring a smooth transition from Looker to Metabase.
1. **Dharma Dexa Phase 3** – Proceeding with the next phase of the Dharma Dexa project.
1. **AppSheet MCN Visit Tracker Dashboard** – Developing and optimizing the dashboard.

**What Support Do You Need? (Optional)**

1. **Data Validation** – Ensuring data accuracy and consistency.
1. **Dharma Dexa Screening Enhancements**
1*. Assigning a **new screening ID** for each event, especially if different questions and inputs are involved.
1*. Adding **location input (province, city)** for better analysis.
)))


----

== Data Analyst & AI ==

>What Have You Done This Week?

(((
{{success}}
What Have You Done?
{{/success}}

**AUTOMARK**

1. Deployed, evaluated, and documented the **Screening Dharma Dexa API** for access by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users_dd>>https://datalake.ptgue.com/v1/users_dd]] and the **documentation has already been sent to the Tech team**. **The API can be queried with a prompt** to retrieve the users relevant to that prompt. As of 13 Feb 2024, **the RAG accuracy is 87.50%**.
1. Deployed, evaluated, and documented the **master user GUE Ecosystem** API for access by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users>>https://datalake.ptgue.com/v1/users_dd]] and the **documentation has already been sent to the Tech team**. **The API has already been tested by the Tech team and has no issues.**
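
An accuracy figure like the RAG accuracy reported above could be computed as the share of evaluation prompts for which the retrieved users match the expected users. This is a minimal sketch under that assumption. The report does not state how its figure was measured, and `retrieval_accuracy` and its exact-match rule are hypothetical choices made here for illustration.

```python
def retrieval_accuracy(results):
    """Share of evaluation prompts whose retrieved users match the expected users.

    `results` is a list of (retrieved_ids, expected_ids) pairs, one per prompt.
    A prompt counts as a hit only on an exact match, ignoring order (assumption).
    """
    if not results:
        raise ValueError("evaluation set is empty")
    hits = sum(1 for retrieved, expected in results if set(retrieved) == set(expected))
    return hits / len(results)
```

With an exact-match rule, a single missing or extra user makes the whole prompt a miss. A softer metric such as recall per prompt may be more informative when prompts are expected to return many users.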

**MCN**

1. Fixed the resource spike caused by the scraping schedule. The fix passed testing for 3 days (Wednesday, Thursday, and Friday). **Scraping time decreased from 1 minute per user to a maximum of about 20 seconds per user.** **(Pairing with Syifa-DE)**
1. **Repointed, realigned, and redesigned the pipeline and database** consumed for scraping. **(Pairing with Syifa-DE)**

**Automation**

1. **Implemented compliance mapping** with the newest data from the compliance team.
1. Validated data, flow, and scripts for the doctor performance user report. **The sample report has already been validated by dr. Astrid.**

**Others**

1. Request and enhancement for the Merck dashboard.
1. Request for DKonsul data: online doctors, transactions, prescriptions, Dexa prescriptions, and a comparison of new users in January.
1. Request for ICD-10 data.

{{warning}}
What Issues Do You Have?
{{/warning}}

1. The Tech team provided no information about the MCN database repointing, so the data was not updated. This has been resolved by coordinating with the Product team.
1. The RAG accuracy needs to be raised to around 90%.

{{info}}
What Will You Do Next? (Optional)
{{/info}}

1. Increase RAG accuracy by adding more training queries for the LLM.
1. Start daily recurring scraping for the TikTok dashboard.
1. Start DKonsul insights next week.

**What Support Do You Need? (Optional)**


=== Summary ===

The following is a summary of the weekly report for 14 February 2025, written by Haekal Yusril Faizin. It summarizes the activities and issues of the Data Engineer, Data Analyst, and Data Analyst & AI teams.

**Data Analyst**

* Completed the second phase of Dharma Dexa.
* Moved data visualization and analytics from Looker to Metabase.
* Redesigned DKonsul data for better structure and organization.
* Handled various on-demand data requests.

**Data Analyst & AI**

* Deployed, evaluated, and documented the Screening Dharma Dexa API for access by the Tech team.
* Deployed, evaluated, and documented the master user GUE Ecosystem for access by the Tech team.
* Fixed the resource spike caused by the scraping schedule.
* Repointed, realigned, and redesigned the pipeline and database used for scraping.
* Implemented compliance mapping with the latest data from the compliance team.
* Validated data, flow, and scripts for the doctor performance user report.

**Issues**

* Limited chart options and customization flexibility in Metabase.
* Lack of information from the Tech team about the MCN database repointing.
* RAG accuracy needs to be raised to around 90%.

**Next Steps**

* Continue the DKonsul data redesign and the migration to Metabase.
* Proceed with Dharma Dexa Phase 3 and develop the AppSheet MCN Visit Tracker Dashboard.
* Increase RAG accuracy, start daily recurring scraping for the TikTok dashboard, and start DKonsul insights next week.

**Support Needed**

* Data validation and Dharma Dexa screening enhancements.
)))