danialmyid

🔍 1. Anatomy of skills.md (Community Standard)

This file typically has the following structure:

---
name: "Data Analyst"
version: "1.2"
role: "Data analysis specialist with Python, pandas, and visualization"
priority: 1
---

## 🧠 Core Skills
- Tabular data cleaning & transformation
- Descriptive & inferential statistical analysis
- Visualization with Plotly/Matplotlib
- pandas query optimization for datasets >100MB

## 🔄 Workflow
1. Validate the data schema & the kinds of missing values
2. Apply imputation/transformation appropriate to the business context
3. Produce code plus a brief explanation for each step
4. Include a warning when assumptions are not met

## 🚫 Constraints
- Always use type hints & docstrings
- Never assume a column exists without confirmation
- Output Python in code blocks tagged `python`
- Avoid deprecated APIs (e.g. `df.append`)

## 📎 Examples
<example>
User: "Clean the 'age' column, which has NaN values and outliers"
Assistant:
```python
import pandas as pd
import numpy as np

def clean_age(df: pd.DataFrame) -> pd.DataFrame:
    df['age'] = df['age'].replace([np.inf, -np.inf], np.nan)
    # Assign back instead of chained inplace fillna (deprecated pattern)
    df['age'] = df['age'].fillna(df['age'].median())
    # IQR capping
    Q1, Q3 = df['age'].quantile([0.25, 0.75])
    IQR = Q3 - Q1
    df.loc[df['age'] < Q1 - 1.5*IQR, 'age'] = Q1 - 1.5*IQR
    df.loc[df['age'] > Q3 + 1.5*IQR, 'age'] = Q3 + 1.5*IQR
    return df
```
</example>
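For a quick check, the example function can be exercised on a tiny frame (the function is restated here so the snippet runs standalone; the sample values are arbitrary):

```python
import numpy as np
import pandas as pd

def clean_age(df: pd.DataFrame) -> pd.DataFrame:
    df['age'] = df['age'].replace([np.inf, -np.inf], np.nan)
    df['age'] = df['age'].fillna(df['age'].median())
    Q1, Q3 = df['age'].quantile([0.25, 0.75])
    IQR = Q3 - Q1
    df.loc[df['age'] < Q1 - 1.5*IQR, 'age'] = Q1 - 1.5*IQR
    df.loc[df['age'] > Q3 + 1.5*IQR, 'age'] = Q3 + 1.5*IQR
    return df

df = clean_age(pd.DataFrame({'age': [20.0, 22.0, 21.0, np.nan, 500.0]}))
# The NaN is imputed with the median (21.5); 500 is capped at Q3 + 1.5*IQR (23.5)
```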

🏗️ 2. Implementation Architecture in Qwen Studio

[Frontend UI] → Upload/Edit `skills.md` → [Backend Parser] → Schema Validation
       ↓
[Prompt Assembler] → Combine: System Base + Skills + Context → [Qwen API Client]
       ↓
[Storage Layer] → Store per Project/User → Versioning & On/Off Toggle

Key components:

| Layer | Technology / Approach |
|-------|-----------------------|
| Parser | python-frontmatter + pydantic + markdown-it-py |
| Validation | JSON Schema / Pydantic model |
| Prompt assembly | Template engine + token budgeting |
| Storage | PostgreSQL/SQLite (JSONB for skills) + Redis (cache of active skills) |
| Integration | Qwen dashscope SDK or REST API (`role: "system"`) |

💻 3. Minimal Implementation (Python)

```python
# skills_parser.py
import re
from typing import List, Union

import yaml
from pydantic import BaseModel, Field, validator  # pydantic v1-style validators

class SkillsSchema(BaseModel):
    name: str
    version: str
    role: str
    priority: int = Field(default=5, ge=0, le=10)
    core_skills: List[str]
    workflow: List[str]
    constraints: List[str]
    examples: str = ""

    @validator('core_skills', 'workflow', 'constraints', pre=True)
    def parse_list(cls, v: Union[str, List[str]]) -> List[str]:
        # Accept either a raw string or a pre-split list of lines,
        # and strip list markers ("- ", "1. ", ...) from each item
        lines = v.split('\n') if isinstance(v, str) else v
        return [re.sub(r'^(?:[-*]|\d+\.)\s*', '', line.strip())
                for line in lines if line.strip()]

def _section_key(heading: str) -> str:
    # "## 🧠 Core Skills" -> "core_skills" (drop '#', emoji, punctuation)
    text = heading.lstrip('#').strip().lower()
    text = re.sub(r'[^a-z0-9 ]', '', text).strip()
    return text.replace(' ', '_')

def parse_skills_md(md_text: str) -> SkillsSchema:
    # Extract the YAML frontmatter
    match = re.match(r'^---\n(.*?)\n---\s*(.*)', md_text, re.DOTALL)
    if not match:
        raise ValueError("Invalid skills.md format: missing YAML frontmatter")

    frontmatter_yaml, body = match.groups()
    data = yaml.safe_load(frontmatter_yaml)

    # Collect the lines under each "## ..." section
    sections = {}
    current_section = None  # guard against body text before the first heading
    for line in body.split('\n'):
        if line.startswith('## '):
            current_section = _section_key(line)
            sections[current_section] = []
        elif current_section is not None:
            sections[current_section].append(line.strip())

    # Map to the schema
    parsed = {
        "name": data.get("name", "Default"),
        "version": data.get("version", "1.0"),
        "role": data.get("role", ""),
        "priority": data.get("priority", 5),
        "core_skills": sections.get("core_skills", []),
        "workflow": sections.get("workflow", []),
        "constraints": sections.get("constraints", []),
        "examples": body,  # keep the raw body for reference
    }
    return SkillsSchema(**parsed)

def assemble_system_prompt(skills: SkillsSchema, base_prompt: str = "") -> str:
    prompt_parts = [base_prompt.strip()] if base_prompt else []
    prompt_parts.append(f"### 🎭 ROLE\n{skills.role}")
    prompt_parts.append("### 🧩 CORE CAPABILITIES\n" + "\n".join(f"- {s}" for s in skills.core_skills))
    prompt_parts.append("### 🔄 WORKFLOW\n" + "\n".join(f"{i+1}. {s}" for i, s in enumerate(skills.workflow)))
    prompt_parts.append("### 🚫 CONSTRAINTS\n" + "\n".join(f"- {c}" for c in skills.constraints))
    if skills.examples:
        prompt_parts.append(f"### 📎 GUIDING EXAMPLES\n{skills.examples}")
    return "\n\n".join(p.strip() for p in prompt_parts if p.strip())
```
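Assuming PyYAML is available, the frontmatter-splitting regex used above can be sanity-checked against the header from section 1 (the regex is repeated here so the snippet runs standalone):

```python
import re
import yaml

md_text = """---
name: "Data Analyst"
version: "1.2"
role: "Data analysis specialist with Python, pandas, and visualization"
priority: 1
---

## Core Skills
- Tabular data cleaning & transformation
"""

# Same pattern as in parse_skills_md: frontmatter between the --- fences,
# remaining Markdown body captured after them
match = re.match(r'^---\n(.*?)\n---\s*(.*)', md_text, re.DOTALL)
frontmatter_yaml, body = match.groups()
data = yaml.safe_load(frontmatter_yaml)
# data is a plain dict: {'name': 'Data Analyst', 'version': '1.2', ..., 'priority': 1}
```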

Integration with the Qwen API:

```python
from dashscope import Generation

from skills_parser import assemble_system_prompt, parse_skills_md

def call_qwen_with_skills(user_input: str, skills_md: str) -> str:
    skills = parse_skills_md(skills_md)
    system_prompt = assemble_system_prompt(skills)

    response = Generation.call(
        model="qwen-max",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input}
        ],
        temperature=0.3,
        max_tokens=2048
    )
    return response.output.text
```

🛡️ 4. Production Considerations (Critical)

| Aspect | Solution |
|--------|----------|
| Token limit | Use a sliding window; prioritize constraints & workflow; compress examples if >2k tokens |
| Prompt injection | Sanitize the Markdown: strip HTML tags, cap the length, block `{{ }}`, strict YAML validation |
| Multi-skills | Order by priority, join with a `===` separator, deduplicate conflicting rules |
| Versioning | Store an MD5 hash per version, rollback UI, audit trail |
| Qwen Studio integration | Use the Custom System Prompt or Project Config feature if available; otherwise build a middleware API wrapper |
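The prompt-injection measures can be sketched as a small standalone sanitizer. This is a minimal illustration: the 8,000-character cap and the tag patterns are arbitrary choices, and a real pipeline would allowlist structural tags such as `<example>` rather than stripping everything:

```python
import re

MAX_SKILLS_CHARS = 8_000  # arbitrary cap for illustration

def sanitize_skills_md(md_text: str) -> str:
    """Cap length, block template placeholders, and strip raw HTML tags."""
    if len(md_text) > MAX_SKILLS_CHARS:
        raise ValueError(f"skills.md exceeds {MAX_SKILLS_CHARS} characters")
    if re.search(r'\{\{.*?\}\}', md_text, re.DOTALL):
        raise ValueError("template placeholders ({{ ... }}) are not allowed")
    # Drop HTML tags but keep their inner text; a production version
    # should allowlist tags the skill format relies on (e.g. <example>)
    return re.sub(r'</?[a-zA-Z][^>]*>', '', md_text)

# sanitize_skills_md('Use <b>pandas</b>') returns 'Use pandas'
```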

📅 5. Implementation Roadmap in Qwen Studio

  1. Prototype (1-2 weeks): parser + validator + CLI testing with dashscope
  2. UI integration (2 weeks): Monaco Markdown editor, prompt preview, per-project on/off toggle
  3. Storage & versioning (1 week): DB schema, Git-like version diffs, restore points
  4. Optimization (1 week): token budgeting, async validation, caching of active skills
  5. Security & audit (1 week): sanitizer pipeline, prompt-injection test suite, rate limiting
  6. Deployment & docs: internal wiki, user guide, example templates per domain (coding, research, design, etc.)

🔗 Technical References