CodeAutoGeneratedSampleEvaluator


2025-10-09

📘 Overview

CodeAutoGeneratedSampleEvaluator is an operator that evaluates code samples based on auto-generation markers. It scans for common phrases indicating automatically generated code within file headers and assigns scores, which can be used to filter out such files from a dataset.
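The header scan described above can be illustrated with a minimal sketch. This is not the operator's implementation: the marker phrases in `MARKERS`, the `header_lines` window, the helper name `score_header`, and the binary scoring rule are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical marker phrases; the operator's real list is not documented here.
MARKERS = (
    "automatically generated",
    "auto-generated",
    "autogenerated",
    "do not edit",
)


def score_header(code_lines: List[str], header_lines: int = 5) -> Tuple[int, float]:
    """Count marker phrases in the first `header_lines` lines and derive a score."""
    # Only the file header is inspected, case-insensitively.
    header = "\n".join(code_lines[:header_lines]).lower()
    count = sum(1 for marker in MARKERS if marker in header)
    # Assumed binary rule: any marker drives the score to 0.0.
    score = 0.0 if count > 0 else 1.0
    return count, score


sample = [
    "# This file is automatically generated by the build system.",
    "def some_function():",
    "    return True",
]
print(score_header(sample))  # (1, 0.0)
```

Restricting the scan to the header keeps the check cheap and avoids penalizing files that merely mention code generation further down.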

`__init__` function

def __init__(self, is_generated_func: Optional[Callable[[], bool]] = None):

Parameter Name	Type	Default	Description
is_generated_func	Optional[Callable[[], bool]]	None	An optional external function that can be provided to perform a custom check for auto-generated code.
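As a rough illustration of a callable that matches the declared type `Callable[[], bool]`, the sketch below builds a zero-argument check with a closure. The helper `make_generated_check` and the file suffixes are hypothetical, and how the operator invokes or combines the result with its own marker scan is not specified here.

```python
from typing import Callable


def make_generated_check(path: str) -> Callable[[], bool]:
    """Build a zero-argument check bound to one file path (hypothetical helper)."""
    generated_suffixes = ("_pb2.py", ".generated.py")  # illustrative only

    def check() -> bool:
        # True when the bound path looks like generator output.
        return path.endswith(generated_suffixes)

    return check


is_generated_func = make_generated_check("service_pb2.py")
print(is_generated_func())  # True
```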

`run` function

def run(self, storage: DataFlowStorage, input_key: str):

Parameter Name	Type	Default	Description
storage	DataFlowStorage	Required	The data flow storage instance, responsible for reading and writing data.
input_key	str	Required	The name of the input column containing the code to be evaluated.

🧠 Example Usage

🧾 Default Output Format

Field	Type	Description
input_key	any	The original input data, kept under the column named by input_key (code_lines in the example below).
CodeAutoGeneratedMarkerCount	int	The number of auto-generation markers detected in the code sample.
CodeAutoGeneratedScore	float	A score from 0.0 to 1.0. A value of 1.0 indicates the code is likely not auto-generated; 0.0 indicates it likely is.

Example Input:

{
    "code_lines": [
        "# This file is automatically generated by the build system.",
        "def some_function():",
        "    return True"
    ]
}

Example Output:

{
    "code_lines": [
        "# This file is automatically generated by the build system.",
        "def some_function():",
        "    return True"
    ],
    "CodeAutoGeneratedMarkerCount": 1,
    "CodeAutoGeneratedScore": 0.0
}
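Once records carry these fields, filtering out auto-generated files is a simple threshold on the score. The sketch below assumes records are plain dictionaries shaped like the example output; the function name `keep_hand_written` and the `min_score` threshold are hypothetical.

```python
from typing import Any, Dict, List


def keep_hand_written(
    records: List[Dict[str, Any]], min_score: float = 1.0
) -> List[Dict[str, Any]]:
    """Keep records whose CodeAutoGeneratedScore meets the threshold."""
    # Records without a score are dropped (an assumption of this sketch).
    return [
        record
        for record in records
        if record.get("CodeAutoGeneratedScore", 0.0) >= min_score
    ]


records = [
    {
        "code_lines": ["# This file is automatically generated by the build system."],
        "CodeAutoGeneratedMarkerCount": 1,
        "CodeAutoGeneratedScore": 0.0,
    },
    {
        "code_lines": ["def add(a, b):", "    return a + b"],
        "CodeAutoGeneratedMarkerCount": 0,
        "CodeAutoGeneratedScore": 1.0,
    },
]
kept = keep_hand_written(records)
print(len(kept))  # 1
```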
