CodeAutoGeneratedSampleEvaluator
2025-10-09
📘 Overview
CodeAutoGeneratedSampleEvaluator is an operator that evaluates code samples based on auto-generation markers. It scans for common phrases indicating automatically generated code within file headers and assigns scores, which can be used to filter out such files from a dataset.
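The header scan described above can be approximated as follows. This is a minimal sketch, not the operator's actual implementation: the marker phrases, the five-line header window, the binary scoring rule, and the helper name `score_auto_generated` are all assumptions made for illustration. Only the two output field names come from this page.

```python
from typing import Dict, List, Union

# Hypothetical marker phrases; the operator's real list is not shown in this document.
ASSUMED_MARKERS = (
    "auto-generated",
    "autogenerated",
    "automatically generated",
    "generated automatically",
)


def score_auto_generated(
    code_lines: List[str], header_lines: int = 5
) -> Dict[str, Union[int, float]]:
    """Count assumed markers in the first `header_lines` lines and derive a binary score."""
    # Only the file header is inspected, and matching is case-insensitive.
    header = "\n".join(code_lines[:header_lines]).lower()
    marker_count = sum(header.count(marker) for marker in ASSUMED_MARKERS)
    return {
        "CodeAutoGeneratedMarkerCount": marker_count,
        # Assumed rule: any marker flags the sample (0.0); otherwise it passes (1.0).
        "CodeAutoGeneratedScore": 0.0 if marker_count > 0 else 1.0,
    }


result = score_auto_generated(
    [
        "# This file is automatically generated by the build system.",
        "def some_function():",
        " return True",
    ]
)
print(result)
```

Restricting the scan to the header keeps the check cheap and avoids flagging files that merely mention code generation somewhere in their body.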
__init__ function
def __init__(self, is_generated_func: Optional[Callable[[], bool]] = None):

| Parameter Name | Type | Default | Description |
|---|---|---|---|
| is_generated_func | Optional[Callable[[], bool]] | None | An optional external function that can be provided to perform a custom check for auto-generated code. |
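Because the documented type is `Callable[[], bool]`, the custom check takes no arguments, so whatever it inspects has to be bound in advance. The sketch below shows one way to build such a callable with `functools.partial`. The helper names `has_generated_banner` and `make_is_generated_func` and the "DO NOT EDIT" banner rule are hypothetical, and how or when the operator invokes the callable is not described in this document.

```python
from functools import partial
from typing import Callable, List


def has_generated_banner(code_lines: List[str]) -> bool:
    """Hypothetical custom check: look for a 'DO NOT EDIT' banner in the first three lines."""
    return any("do not edit" in line.lower() for line in code_lines[:3])


def make_is_generated_func(code_lines: List[str]) -> Callable[[], bool]:
    """Bind the data up front so the result matches Callable[[], bool]."""
    return partial(has_generated_banner, code_lines)


sample = ["// Code generated by protoc-gen-go. DO NOT EDIT.", "package main"]
is_generated_func = make_is_generated_func(sample)
print(is_generated_func())
```

A callable built this way matches the constructor signature shown above and could be passed as `CodeAutoGeneratedSampleEvaluator(is_generated_func=is_generated_func)`.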
run function
def run(self, storage: DataFlowStorage, input_key: str):

| Name | Type | Default | Description |
|---|---|---|---|
| storage | DataFlowStorage | Required | The data flow storage instance, responsible for reading and writing data. |
| input_key | str | Required | The name of the input column containing the code to be evaluated. |
🧠 Example Usage
🧾 Default Output Format
| Field | Type | Description |
|---|---|---|
| input_key | any | The original input data, stored under the column named by `input_key` (`code_lines` in the example below). |
| CodeAutoGeneratedMarkerCount | int | The number of auto-generation markers detected in the code sample. |
| CodeAutoGeneratedScore | float | A score from 0.0 to 1.0, where 1.0 indicates the code is likely not auto-generated, and 0.0 indicates it is. |
Example Input:

{
  "code_lines": [
    "# This file is automatically generated by the build system.",
    "def some_function():",
    " return True"
  ]
}

Example Output:

{
  "code_lines": [
    "# This file is automatically generated by the build system.",
    "def some_function():",
    " return True"
  ],
  "CodeAutoGeneratedMarkerCount": 1,
  "CodeAutoGeneratedScore": 0.0
}
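
The overview notes that the scores can be used to filter a dataset. A minimal sketch of that step in plain Python, assuming records shaped like the example output above. The helper name `keep_hand_written` and the 1.0 threshold are illustrative choices, not part of the operator.

```python
from typing import Any, Dict, List


def keep_hand_written(
    records: List[Dict[str, Any]], min_score: float = 1.0
) -> List[Dict[str, Any]]:
    """Keep records whose CodeAutoGeneratedScore is at least `min_score`."""
    # Records without a score are treated as flagged (0.0) and dropped.
    return [r for r in records if r.get("CodeAutoGeneratedScore", 0.0) >= min_score]


records = [
    {
        "code_lines": [
            "# This file is automatically generated by the build system.",
            "def some_function():",
            " return True",
        ],
        "CodeAutoGeneratedMarkerCount": 1,
        "CodeAutoGeneratedScore": 0.0,
    },
    {
        # Second record is made up for illustration.
        "code_lines": ["def add(a, b):", "    return a + b"],
        "CodeAutoGeneratedMarkerCount": 0,
        "CodeAutoGeneratedScore": 1.0,
    },
]

kept = keep_hand_written(records)
print(len(kept))
```

Dropping unscored records by default is a conservative choice for this sketch; a pipeline that prefers to keep unscored data could change the fallback value to 1.0.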
