PANDASAI

This section of the API reference covers the base module implementation, along with some package constants and exceptions.

Main

The __init__ of the pandasai module contains a high-level wrapper to run the package.

pandasai

PandasAI is a wrapper around an LLM to make dataframes conversational.

This module includes the implementation of the basic PandasAI class, with methods to run LLMs on pandas dataframes. The following LLMs are implemented so far.

Example

This module is the entry point of the pandasai package. The following example shows how to use the PandasAI class.

import pandas as pd
from pandasai import PandasAI

# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain",
    "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416,
    1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

# Instantiate a LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI(api_token="YOUR_API_TOKEN")

pandas_ai = PandasAI(llm, conversational=False)
pandas_ai(df, prompt='Which are the 5 happiest countries?')

PandasAI

PandasAI is a wrapper around an LLM to make dataframes conversational.

This class is the entry point of the pandasai package. It consists of methods to interface LLMs with pandas dataframes. Dataframe metadata, i.e. df.head(), and the prompt are passed to the chosen LLM's API endpoint to generate Python code that answers the question asked. The resulting Python code is run on the actual data, and the answer is converted into conversational form.
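
The flow described above can be sketched without any dependencies. This is a minimal illustration, not the package's implementation: a list of dicts stands in for a dataframe, and `fake_llm` and `answer_question` are hypothetical names that replace the real LLM call and the `run` method.

```python
from typing import Callable


def fake_llm(instruction: str) -> str:
    """Hypothetical stand-in for an LLM call: always returns the same code."""
    return "result = sorted(rows, key=lambda r: r['happiness_index'], reverse=True)[:2]"


def answer_question(rows: list, prompt: str, llm: Callable[[str], str]) -> list:
    # 1. Only a small preview (like df.head()) and the question go to the LLM.
    preview = rows[:5]
    instruction = f"Data preview: {preview}\nQuestion: {prompt}\nReturn Python code."
    # 2. The LLM answers with Python code as text.
    code = llm(instruction)
    # 3. The code runs locally against the full data, not just the preview.
    namespace = {"rows": rows}
    exec(code, namespace)
    return namespace["result"]


rows = [
    {"country": "Canada", "happiness_index": 7.23},
    {"country": "Australia", "happiness_index": 7.22},
    {"country": "China", "happiness_index": 5.12},
]
top = answer_question(rows, "Which are the 2 happiest countries?", fake_llm)
print([r["country"] for r in top])
```

The point of the split is that the model only ever sees the preview and the question, while the generated code is executed against the complete data on the caller's machine.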

Note

Do not include the self parameter in the Args section.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| _llm | obj | LLM to be used for API access. | required |
| _verbose | bool | Whether to show intermediate outputs, e.g. the Python code generated and the execution step on the prompt. | False |
| _is_conversational_answer | bool | Whether to return the answer in conversational form. | True |
| _enforce_privacy | bool | Do not display the data in the prompt in case of sensitive data. | False |
| _max_retries | int | Maximum number of tries to generate code on failure. | 3 |
| _is_notebook | bool | Whether to run code in a notebook. | False |
| _original_instructions | dict | The dict of instructions to run. | dict with all values None |
| last_code_generated | str | The last code generated, if any. | None |
| last_run_code | str | The last code executed. | None |
| code_output | str | The code output, if any. | None |
| last_error | str | The error raised by the last code run. | None |

Returns:

| Name | Type | Description |
|---|---|---|
| response | str | The response to a question related to the data |

Source code in pandasai/__init__.py
class PandasAI:
    """PandasAI is a wrapper around a LLM to make dataframes conversational.


    This is the entry point of the `pandasai` object. This class consists of methods to interface
    LLMs with pandas dataframes. Dataframe metadata, i.e. df.head(), and the prompt are passed to
    the chosen LLM's API endpoint to generate Python code that answers the question asked. The
    resulting Python code is run on the actual data and the answer is converted into a
    conversational form.

    Note:
        Do not include the `self` parameter in the ``Args`` section.
    Args:
        _llm (obj): LLMs option to be used for API access
        _verbose (bool, optional): To show the intermediate outputs e.g python code
            generated and execution step on the prompt. Default to False
        _is_conversational_answer (bool, optional): Whether to return answer in conversational
            form. Default to True
        _enforce_privacy (bool, optional): Do not display the data on prompt in case of
            Sensitive data. Default to False
        _max_retries (int, optional): max no. of tries to generate code on failure. Default to 3
        _is_notebook (bool, optional): Whether to run code in notebook. Default to False
        _original_instructions (dict, optional): The dict of instruction to run. Default to None
        last_code_generated (str, optional): Pass last Code if generated. Default to None
        last_run_code (str, optional): Pass the last execution / run. Default to None
        code_output (str, optional): The code output if any. Default to None
        last_error (str, optional): Error of running code last time. Default to None


    Returns:
        response (str): Returns the Response to a Question related to Data

    """

    _llm: LLM
    _verbose: bool = False
    _is_conversational_answer: bool = True
    _enforce_privacy: bool = False
    _max_retries: int = 3
    _is_notebook: bool = False
    _original_instructions: dict = {
        "question": None,
        "df_head": None,
        "num_rows": None,
        "num_columns": None,
        "rows_to_display": None,
    }
    last_code_generated: Optional[str] = None
    last_run_code: Optional[str] = None
    code_output: Optional[str] = None
    last_error: Optional[str] = None



    def __init__(
        self,
        llm=None,
        conversational=True,
        verbose=False,
        enforce_privacy=False,
    ):
        """

        __init__ method of the Class PandasAI

        Args:
            llm (object): LLMs option to be used for API access. Default is None
            conversational (bool): Whether to return answer in conversational form. Default to True
            verbose (bool): To show the intermediate outputs e.g python code generated and
             execution step on the prompt.  Default to False
            enforce_privacy (bool): Execute the codes with Privacy Mode ON.  Default to False
        """
        if llm is None:
            raise LLMNotFoundError(
                "An LLM should be provided to instantiate a PandasAI instance"
            )
        self._llm = llm
        self._is_conversational_answer = conversational
        self._verbose = verbose
        self._enforce_privacy = enforce_privacy

        self.notebook = Notebook()
        self._in_notebook = self.notebook.in_notebook()

    def conversational_answer(self, question: str, answer: str) -> str:

        """Returns the answer in conversational form about the resultant data.

        Args:
            question (str): A question in Conversational form
            answer (str): A summary / resultant Data

        Returns (str): Response

        """

        if self._enforce_privacy:
            # we don't want to send potentially sensitive data to the LLM server
            # if the user has set enforce_privacy to True
            return answer

        instruction = GenerateResponsePrompt(question=question, answer=answer)
        return self._llm.call(instruction, "")

    def run(
        self,
        data_frame: pd.DataFrame,
        prompt: str,
        is_conversational_answer: bool = None,
        show_code: bool = False,
        anonymize_df: bool = True,
        use_error_correction_framework: bool = True,
    ) -> str:
        """
        Run the PandasAI to make Dataframes Conversational.

        Args:
            data_frame (pd.Dataframe): A pandas Dataframe
            prompt (str): A prompt to query about the Dataframe
            is_conversational_answer (bool): Whether to return answer in conversational form.
            Default to False
            show_code (bool): To show the intermediate python code generated on the prompt.
            Default to False
            anonymize_df (bool): Running the code with Sensitive Data. Default to True
            use_error_correction_framework (bool): Turn on Error Correction mechanism.
            Default to True

        Returns: Answer to the Input Questions about the DataFrame

        """

        self.log(f"Running PandasAI with {self._llm.type} LLM...")

        try:
            rows_to_display = 0 if self._enforce_privacy else 5

            multiple: bool = isinstance(data_frame, list)

            if multiple:

                heads = [anonymize_dataframe_head(dataframe)
                        if anonymize_df
                        else dataframe.head(rows_to_display)
                        for dataframe in data_frame]

                code = self._llm.generate_code(
                    MultipleDataframesPrompt(dataframes=heads),
                    prompt,
                )

                self._original_instructions = {
                    "question": prompt,
                    "df_head": heads,
                    "rows_to_display": rows_to_display,
                }

            else:

                df_head = data_frame.head(rows_to_display)
                if anonymize_df:
                    df_head = anonymize_dataframe_head(df_head)

                code = self._llm.generate_code(
                    GeneratePythonCodePrompt(
                        prompt=prompt,
                        df_head=df_head,
                        num_rows=data_frame.shape[0],
                        num_columns=data_frame.shape[1],
                        rows_to_display=rows_to_display,
                    ),
                    prompt,
                )

                self._original_instructions = {
                    "question": prompt,
                    "df_head": df_head,
                    "num_rows": data_frame.shape[0],
                    "num_columns": data_frame.shape[1],
                    "rows_to_display": rows_to_display,
                }

            self.last_code_generated = code
            self.log(
                f"""
                    Code generated:
                    ```
                    {code}
                    ```
                """
            )
            if show_code and self._in_notebook:
                self.notebook.create_new_cell(code)

            answer = self.run_code(
                code,
                data_frame,
                use_error_correction_framework=use_error_correction_framework,
            )
            self.code_output = answer
            self.log(f"Answer: {answer}")

            if is_conversational_answer is None:
                is_conversational_answer = self._is_conversational_answer
            if is_conversational_answer:
                answer = self.conversational_answer(prompt, answer)
                self.log(f"Conversational answer: {answer}")
            return answer
        except Exception as exception:  # pylint: disable=broad-except
            self.last_error = str(exception)
            return (
                "Unfortunately, I was not able to answer your question, "
                "because of the following error:\n"
                f"\n{exception}\n"
            )

    def __call__(
        self,
        data_frame: pd.DataFrame,
        prompt: str,
        is_conversational_answer: bool = None,
        show_code: bool = False,
        anonymize_df: bool = True,
        use_error_correction_framework: bool = True,
    ) -> str:
        """
        __call__ method of PandasAI class. It call `run` method
        Args:
            data_frame:
            prompt:
            is_conversational_answer:
            show_code:
            anonymize_df:
            use_error_correction_framework:

        Returns:

        """

        return self.run(
            data_frame,
            prompt,
            is_conversational_answer,
            show_code,
            anonymize_df,
            use_error_correction_framework,
        )

    def is_unsafe_import(self, node: ast.stmt) -> bool:

        """Remove non-whitelisted imports from the code to prevent malicious code execution

        Args:
            node (object): ast.stmt

        Returns (bool): A flag if unsafe_imports found.

        """

        return isinstance(node, (ast.Import, ast.ImportFrom)) and any(
            alias.name not in WHITELISTED_LIBRARIES for alias in node.names
        )

    def is_df_overwrite(self, node: ast.stmt) -> str:

        """
        Remove df declarations from the code to prevent malicious code execution. A helper method.
        Args:
            node (object): ast.stmt

        Returns (str):

        """

        return (
            isinstance(node, ast.Assign)
            and isinstance(node.targets[0], ast.Name)
            and re.match(r"df\d{0,2}$", node.targets[0].id)
        )

    def clean_code(self, code: str) -> str:

        """
        A method to clean the code to prevent malicious code execution
        Args:
            code(str): A python code

        Returns (str): Returns a Clean Code String

        """

        tree = ast.parse(code)

        new_body = [
            node
            for node in tree.body
            if not (self.is_unsafe_import(node) or self.is_df_overwrite(node))
        ]

        new_tree = ast.Module(body=new_body)
        return astor.to_source(new_tree).strip()

    def run_code(
        self,
        code: str,
        data_frame: pd.DataFrame,
        use_error_correction_framework: bool = True,
    ) -> str:
        """
        A method to execute the python code generated by LLMs to answer the question about the
        input dataframe. Run the code in the current context and return the result.
        Args:
            code (str): A python code to execute
            data_frame (pd.DataFrame): A full Pandas DataFrame
            use_error_correction_framework (bool): Turn on Error Correction mechanism.
            Default to True

        Returns:

        """

        # pylint: disable=W0122 disable=W0123 disable=W0702:bare-except

        multiple: bool = isinstance(data_frame, list)
        # Get the code to run removing unsafe imports and df overwrites
        code_to_run = self.clean_code(code)
        self.last_run_code = code_to_run
        self.log(
            f"""
Code running:
```
{code_to_run}
```"""
        )

        environment: dict = {
            "pd": pd,
            "plt": plt,
            "__builtins__": {
                **{
                    builtin: __builtins__[builtin]
                    for builtin in WHITELISTED_BUILTINS
                },
            },
        }

        if multiple:
            environment.update({
                f"df{i}": dataframe for i, dataframe in enumerate(data_frame, start = 1)
            })

        else:
            environment["df"] = data_frame

        # Redirect standard output to a StringIO buffer
        with redirect_stdout(io.StringIO()) as output:
            count = 0
            while count < self._max_retries:
                try:
                    # Execute the code
                    exec(code_to_run, environment)
                    code = code_to_run
                    break
                except Exception as e:  # pylint: disable=W0718 disable=C0103
                    if not use_error_correction_framework:
                        raise e

                    count += 1

                    if multiple:
                        error_correcting_instruction = CorrectMultipleDataframesErrorPrompt(
                            code=code,
                            error_returned=e,
                            question=self._original_instructions["question"],
                            df_head=self._original_instructions["df_head"],
                        )

                    else:
                        error_correcting_instruction = CorrectErrorPrompt(
                            code=code,
                            error_returned=e,
                            question=self._original_instructions["question"],
                            df_head=self._original_instructions["df_head"],
                            num_rows=self._original_instructions["num_rows"],
                            num_columns=self._original_instructions["num_columns"],
                            rows_to_display=self._original_instructions["rows_to_display"],
                        )

                    code_to_run = self._llm.generate_code(
                        error_correcting_instruction, ""
                    )

        captured_output = output.getvalue()

        # Evaluate the last line and return its value or the captured output
        lines = code.strip().split("\n")
        last_line = lines[-1].strip()

        match = re.match(r"^print\((.*)\)$", last_line)
        if match:
            last_line = match.group(1)

        try:
            return eval(last_line, environment)
        except Exception:  # pylint: disable=W0718
            return captured_output

    def log(self, message: str):
        """Log a message"""
        if self._verbose:
            print(message)

__call__(data_frame, prompt, is_conversational_answer=None, show_code=False, anonymize_df=True, use_error_correction_framework=True)

__call__ method of the PandasAI class. It calls the run method with the same arguments.

Parameters:

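| Name | Type | Description | Default |
|---|---|---|---|
| data_frame | pd.DataFrame | See run | required |
| prompt | str | See run | required |
| is_conversational_answer | bool | See run | None |
| show_code | bool | See run | False |
| anonymize_df | bool | See run | True |
| use_error_correction_framework | bool | See run | True |

Source code in pandasai/__init__.py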
def __call__(
    self,
    data_frame: pd.DataFrame,
    prompt: str,
    is_conversational_answer: bool = None,
    show_code: bool = False,
    anonymize_df: bool = True,
    use_error_correction_framework: bool = True,
) -> str:
    """
    __call__ method of PandasAI class. It call `run` method
    Args:
        data_frame:
        prompt:
        is_conversational_answer:
        show_code:
        anonymize_df:
        use_error_correction_framework:

    Returns:

    """

    return self.run(
        data_frame,
        prompt,
        is_conversational_answer,
        show_code,
        anonymize_df,
        use_error_correction_framework,
    )

__init__(llm=None, conversational=True, verbose=False, enforce_privacy=False)

__init__ method of the PandasAI class.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| llm | object | LLM to be used for API access. An LLMNotFoundError is raised if it is left as None. | None |
| conversational | bool | Whether to return the answer in conversational form. | True |
| verbose | bool | Whether to show intermediate outputs, e.g. the Python code generated and the execution step on the prompt. | False |
| enforce_privacy | bool | Execute the code with privacy mode on. | False |

Source code in pandasai/__init__.py
def __init__(
    self,
    llm=None,
    conversational=True,
    verbose=False,
    enforce_privacy=False,
):
    """

    __init__ method of the Class PandasAI

    Args:
        llm (object): LLMs option to be used for API access. Default is None
        conversational (bool): Whether to return answer in conversational form. Default to True
        verbose (bool): To show the intermediate outputs e.g python code generated and
         execution step on the prompt.  Default to False
        enforce_privacy (bool): Execute the codes with Privacy Mode ON.  Default to False
    """
    if llm is None:
        raise LLMNotFoundError(
            "An LLM should be provided to instantiate a PandasAI instance"
        )
    self._llm = llm
    self._is_conversational_answer = conversational
    self._verbose = verbose
    self._enforce_privacy = enforce_privacy

    self.notebook = Notebook()
    self._in_notebook = self.notebook.in_notebook()

clean_code(code)

A method to clean the code to prevent malicious code execution

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| code | str | Python code to clean | required |

Returns (str): The cleaned code string

Source code in pandasai/__init__.py
def clean_code(self, code: str) -> str:

    """
    A method to clean the code to prevent malicious code execution
    Args:
        code(str): A python code

    Returns (str): Returns a Clean Code String

    """

    tree = ast.parse(code)

    new_body = [
        node
        for node in tree.body
        if not (self.is_unsafe_import(node) or self.is_df_overwrite(node))
    ]

    new_tree = ast.Module(body=new_body)
    return astor.to_source(new_tree).strip()
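
The filtering step can be tried in isolation with the standard library only. The sketch below is a simplified assumption-laden version of the listing above: `ast.unparse` (Python 3.9+) is swapped in for `astor.to_source`, `ALLOWED_LIBRARIES` is a hypothetical allowlist rather than the package's real one, and only an exact `df` target is treated as an overwrite.

```python
import ast

ALLOWED_LIBRARIES = {"numpy"}  # hypothetical allowlist, not the package's real one


def should_drop(node: ast.stmt) -> bool:
    """True for imports outside the allowlist and for plain `df = ...` assignments."""
    if isinstance(node, (ast.Import, ast.ImportFrom)):
        return any(alias.name not in ALLOWED_LIBRARIES for alias in node.names)
    return (
        isinstance(node, ast.Assign)
        and isinstance(node.targets[0], ast.Name)
        and node.targets[0].id == "df"
    )


def clean(code: str) -> str:
    tree = ast.parse(code)
    # Only top-level statements are inspected, as in the listing above.
    tree.body = [node for node in tree.body if not should_drop(node)]
    return ast.unparse(tree).strip()  # ast.unparse needs Python 3.9+


generated = "import os\ndf = None\nresult = df['gdp'].max()\nprint(result)"
cleaned = clean(generated)
print(cleaned)
```

Because the filter walks only `tree.body`, statements nested inside a function or an `if` block are not examined by this kind of check.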

conversational_answer(question, answer)

Returns the answer in conversational form about the resultant data.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| question | str | A question in conversational form | required |
| answer | str | A summary / the resulting data | required |

Returns (str): Response

Source code in pandasai/__init__.py
def conversational_answer(self, question: str, answer: str) -> str:

    """Returns the answer in conversational form about the resultant data.

    Args:
        question (str): A question in Conversational form
        answer (str): A summary / resultant Data

    Returns (str): Response

    """

    if self._enforce_privacy:
        # we don't want to send potentially sensitive data to the LLM server
        # if the user has set enforce_privacy to True
        return answer

    instruction = GenerateResponsePrompt(question=question, answer=answer)
    return self._llm.call(instruction, "")
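
The privacy short-circuit is easy to see with a stub in place of a real LLM. In this sketch `EchoLLM` is a hypothetical stand-in, and the instruction is a plain string rather than a `GenerateResponsePrompt`.

```python
class EchoLLM:
    """Hypothetical stand-in for an LLM client; records what it was sent."""

    def __init__(self) -> None:
        self.received = []

    def call(self, instruction: str, value: str) -> str:
        self.received.append(instruction)
        return f"Here is what I found: {instruction}"


def conversational_answer(llm: EchoLLM, question: str, answer: str, enforce_privacy: bool) -> str:
    if enforce_privacy:
        # Privacy mode: the raw answer is returned and never sent to the LLM.
        return answer
    return llm.call(f"{question} -> {answer}", "")


llm = EchoLLM()
private = conversational_answer(llm, "Highest GDP?", "United States", enforce_privacy=True)
public = conversational_answer(llm, "Highest GDP?", "United States", enforce_privacy=False)
print(private)
print(public)
print(len(llm.received))
```

Only the second call reaches the stub, so with privacy enforced the caller gets the unformatted result back.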

is_df_overwrite(node)

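Helper method that checks whether a statement assigns to a dataframe variable (df, or df followed by one or two digits). Such declarations are removed from the code to prevent malicious code execution.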

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| node | object | ast.stmt | required |

Returns: A truthy value if the node overwrites a dataframe variable.

Source code in pandasai/__init__.py
def is_df_overwrite(self, node: ast.stmt) -> str:

    """
    Remove df declarations from the code to prevent malicious code execution. A helper method.
    Args:
        node (object): ast.stmt

    Returns (str):

    """

    return (
        isinstance(node, ast.Assign)
        and isinstance(node.targets[0], ast.Name)
        and re.match(r"df\d{0,2}$", node.targets[0].id)
    )
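
To see which statements the check above catches, it can be run against a few one-line snippets. The function body mirrors the listing, except that the match is converted to a real `bool`; the `check` helper is a hypothetical convenience.

```python
import ast
import re


def is_df_overwrite(node: ast.stmt) -> bool:
    return (
        isinstance(node, ast.Assign)
        and isinstance(node.targets[0], ast.Name)
        and re.match(r"df\d{0,2}$", node.targets[0].id) is not None
    )


def check(statement: str) -> bool:
    """Parse a single statement and test it."""
    return is_df_overwrite(ast.parse(statement).body[0])


statements = ["df = 1", "df2 = 1", "df123 = 1", "data = 1", "df['x'] = 1", "df += 1"]
results = {statement: check(statement) for statement in statements}
for statement, flagged in results.items():
    print(f"{statement!r}: {flagged}")
```

Only plain assignments to `df`, `df2` and similar names are flagged. Item assignment (`df['x'] = 1`) and augmented assignment (`df += 1`) are different AST node shapes, so they pass through.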

is_unsafe_import(node)

Checks whether a statement is an import of a non-whitelisted library. Such imports are removed from the code to prevent malicious code execution.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| node | object | ast.stmt | required |

Returns (bool): A flag if unsafe_imports found.

Source code in pandasai/__init__.py
def is_unsafe_import(self, node: ast.stmt) -> bool:

    """Remove non-whitelisted imports from the code to prevent malicious code execution

    Args:
        node (object): ast.stmt

    Returns (bool): A flag if unsafe_imports found.

    """

    return isinstance(node, (ast.Import, ast.ImportFrom)) and any(
        alias.name not in WHITELISTED_LIBRARIES for alias in node.names
    )
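
The same check can be exercised on its own. In this sketch `ALLOWED` is a hypothetical allowlist standing in for `WHITELISTED_LIBRARIES`, whose real contents are not shown on this page.

```python
import ast

ALLOWED = {"numpy", "math"}  # hypothetical stand-in for WHITELISTED_LIBRARIES


def is_unsafe_import(node: ast.stmt) -> bool:
    return isinstance(node, (ast.Import, ast.ImportFrom)) and any(
        alias.name not in ALLOWED for alias in node.names
    )


statements = [
    "import numpy",
    "import numpy as np",
    "import os",
    "import numpy, os",
    "from math import sqrt",
    "x = 1",
]
flags = {s: is_unsafe_import(ast.parse(s).body[0]) for s in statements}
for statement, unsafe in flags.items():
    print(f"{statement!r}: {unsafe}")
```

Note that `alias.name` holds the imported name, so with the check written this way `from math import sqrt` compares `sqrt`, not `math`, against the allowlist and is flagged even though `math` is listed.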

log(message)

Log a message. The message is printed only when verbose mode is enabled.

Source code in pandasai/__init__.py
def log(self, message: str):
    """Log a message"""
    if self._verbose:
        print(message)

run(data_frame, prompt, is_conversational_answer=None, show_code=False, anonymize_df=True, use_error_correction_framework=True)

Run the PandasAI to make Dataframes Conversational.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data_frame | pd.DataFrame | A pandas DataFrame, or a list of DataFrames. | required |
| prompt | str | A prompt to query about the DataFrame. | required |
| is_conversational_answer | bool | Whether to return the answer in conversational form. When None, the value set on the instance is used. | None |
| show_code | bool | Whether to show the intermediate Python code generated for the prompt (in a new cell when running in a notebook). | False |
| anonymize_df | bool | Whether to anonymize the dataframe head before it is sent to the LLM. | True |
| use_error_correction_framework | bool | Turn on the error correction mechanism. | True |

Source code in pandasai/__init__.py
def run(
    self,
    data_frame: pd.DataFrame,
    prompt: str,
    is_conversational_answer: bool = None,
    show_code: bool = False,
    anonymize_df: bool = True,
    use_error_correction_framework: bool = True,
) -> str:
    """
    Run the PandasAI to make Dataframes Conversational.

    Args:
        data_frame (pd.Dataframe): A pandas Dataframe
        prompt (str): A prompt to query about the Dataframe
        is_conversational_answer (bool): Whether to return answer in conversational form.
        Default to False
        show_code (bool): To show the intermediate python code generated on the prompt.
        Default to False
        anonymize_df (bool): Running the code with Sensitive Data. Default to True
        use_error_correction_framework (bool): Turn on Error Correction mechanism.
        Default to True

    Returns: Answer to the Input Questions about the DataFrame

    """

    self.log(f"Running PandasAI with {self._llm.type} LLM...")

    try:
        rows_to_display = 0 if self._enforce_privacy else 5

        multiple: bool = isinstance(data_frame, list)

        if multiple:

            heads = [anonymize_dataframe_head(dataframe)
                    if anonymize_df
                    else dataframe.head(rows_to_display)
                    for dataframe in data_frame]

            code = self._llm.generate_code(
                MultipleDataframesPrompt(dataframes=heads),
                prompt,
            )

            self._original_instructions = {
                "question": prompt,
                "df_head": heads,
                "rows_to_display": rows_to_display,
            }

        else:

            df_head = data_frame.head(rows_to_display)
            if anonymize_df:
                df_head = anonymize_dataframe_head(df_head)

            code = self._llm.generate_code(
                GeneratePythonCodePrompt(
                    prompt=prompt,
                    df_head=df_head,
                    num_rows=data_frame.shape[0],
                    num_columns=data_frame.shape[1],
                    rows_to_display=rows_to_display,
                ),
                prompt,
            )

            self._original_instructions = {
                "question": prompt,
                "df_head": df_head,
                "num_rows": data_frame.shape[0],
                "num_columns": data_frame.shape[1],
                "rows_to_display": rows_to_display,
            }

        self.last_code_generated = code
        self.log(
            f"""
                Code generated:
                ```
                {code}
                ```
            """
        )
        if show_code and self._in_notebook:
            self.notebook.create_new_cell(code)

        answer = self.run_code(
            code,
            data_frame,
            use_error_correction_framework=use_error_correction_framework,
        )
        self.code_output = answer
        self.log(f"Answer: {answer}")

        if is_conversational_answer is None:
            is_conversational_answer = self._is_conversational_answer
        if is_conversational_answer:
            answer = self.conversational_answer(prompt, answer)
            self.log(f"Conversational answer: {answer}")
        return answer
    except Exception as exception:  # pylint: disable=broad-except
        self.last_error = str(exception)
        return (
            "Unfortunately, I was not able to answer your question, "
            "because of the following error:\n"
            f"\n{exception}\n"
        )
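
Two behaviours of `run` are worth isolating: a per-call `is_conversational_answer` of `None` falls back to the instance setting, and exceptions are turned into a message string instead of propagating. The sketch below shows both with a hypothetical `MiniRunner` class, where a `compute` callable replaces code generation and execution.

```python
from typing import Callable, Optional


class MiniRunner:
    """Hypothetical, stripped-down stand-in for the class above."""

    def __init__(self, conversational: bool = True) -> None:
        self._is_conversational_answer = conversational
        self.last_error: Optional[str] = None

    def run(self, compute: Callable[[], str], is_conversational_answer: Optional[bool] = None) -> str:
        try:
            answer = compute()
            # None means "use whatever was configured on the instance".
            if is_conversational_answer is None:
                is_conversational_answer = self._is_conversational_answer
            if is_conversational_answer:
                answer = f"The answer is {answer}."
            return answer
        except Exception as exception:
            # Errors are recorded and reported as text, never re-raised.
            self.last_error = str(exception)
            return (
                "Unfortunately, I was not able to answer your question, "
                "because of the following error:\n"
                f"\n{exception}\n"
            )


runner = MiniRunner(conversational=False)
plain = runner.run(lambda: "42")
chatty = runner.run(lambda: "42", is_conversational_answer=True)
failed = runner.run(lambda: str(1 / 0))
print(plain)
print(chatty)
print(runner.last_error)
```

A caller that needs to distinguish success from failure therefore has to inspect `last_error` or the returned text, since no exception reaches it.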

run_code(code, data_frame, use_error_correction_framework=True)

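Executes the Python code generated by the LLM to answer the question about the input dataframe. The code is run with exec in an environment limited to pd, plt, the dataframe(s) and the whitelisted builtins, and the value of its last line (or, failing that, the captured standard output) is returned.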

Parameters:

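| Name | Type | Description | Default |
|---|---|---|---|
| code | str | Python code to execute | required |
| data_frame | pd.DataFrame | The full pandas DataFrame | required |
| use_error_correction_framework | bool | Turn on the error correction mechanism: code that fails is sent back to the LLM for correction, up to the maximum number of retries. | True |

Source code in pandasai/__init__.py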
    def run_code(
        self,
        code: str,
        data_frame: pd.DataFrame,
        use_error_correction_framework: bool = True,
    ) -> str:
        """
        A method to execute the python code generated by LLMs to answer the question about the
        input dataframe. Run the code in the current context and return the result.
        Args:
            code (str): A python code to execute
            data_frame (pd.DataFrame): A full Pandas DataFrame
            use_error_correction_framework (bool): Turn on Error Correction mechanism.
            Default to True

        Returns:

        """

        # pylint: disable=W0122 disable=W0123 disable=W0702:bare-except

        multiple: bool = isinstance(data_frame, list)
        # Get the code to run removing unsafe imports and df overwrites
        code_to_run = self.clean_code(code)
        self.last_run_code = code_to_run
        self.log(
            f"""
Code running:
```
{code_to_run}
```"""
        )

        environment: dict = {
            "pd": pd,
            "plt": plt,
            "__builtins__": {
                **{
                    builtin: __builtins__[builtin]
                    for builtin in WHITELISTED_BUILTINS
                },
            },
        }

        if multiple:
            environment.update({
                f"df{i}": dataframe for i, dataframe in enumerate(data_frame, start = 1)
            })

        else:
            environment["df"] = data_frame

        # Redirect standard output to a StringIO buffer
        with redirect_stdout(io.StringIO()) as output:
            count = 0
            while count < self._max_retries:
                try:
                    # Execute the code
                    exec(code_to_run, environment)
                    code = code_to_run
                    break
                except Exception as e:  # pylint: disable=W0718 disable=C0103
                    if not use_error_correction_framework:
                        raise e

                    count += 1

                    if multiple:
                        error_correcting_instruction = CorrectMultipleDataframesErrorPrompt(
                            code=code,
                            error_returned=e,
                            question=self._original_instructions["question"],
                            df_head=self._original_instructions["df_head"],
                        )

                    else:
                        error_correcting_instruction = CorrectErrorPrompt(
                            code=code,
                            error_returned=e,
                            question=self._original_instructions["question"],
                            df_head=self._original_instructions["df_head"],
                            num_rows=self._original_instructions["num_rows"],
                            num_columns=self._original_instructions["num_columns"],
                            rows_to_display=self._original_instructions["rows_to_display"],
                        )

                    code_to_run = self._llm.generate_code(
                        error_correcting_instruction, ""
                    )

        captured_output = output.getvalue()

        # Evaluate the last line and return its value or the captured output
        lines = code.strip().split("\n")
        last_line = lines[-1].strip()

        match = re.match(r"^print\((.*)\)$", last_line)
        if match:
            last_line = match.group(1)

        try:
            return eval(last_line, environment)
        except Exception:  # pylint: disable=W0718
            return captured_output
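
The core of run_code (restricted builtins, captured standard output, and evaluation of the last line) can be sketched in isolation. This is a simplified illustration, not the library's implementation: `run_snippet` and `ALLOWED_BUILTINS` are hypothetical names, the allow-list contents are made up for the example, and the retry loop is omitted. The sketch reads builtins from the `builtins` module because `__builtins__` is only a dict inside imported modules.

```python
import builtins
import io
import re
from contextlib import redirect_stdout

# Hypothetical allow-list for this sketch; the real one lives in pandasai.constants.
ALLOWED_BUILTINS = ["len", "print", "sum", "sorted", "range"]


def run_snippet(code: str, variables: dict):
    """Execute `code` with restricted builtins and return the value of its last line."""
    environment = {
        **variables,
        "__builtins__": {name: getattr(builtins, name) for name in ALLOWED_BUILTINS},
    }

    # Capture anything the snippet prints instead of writing it to the console.
    with redirect_stdout(io.StringIO()) as output:
        exec(code, environment)

    # If the last line is a print(...) call, evaluate its argument instead.
    last_line = code.strip().split("\n")[-1].strip()
    match = re.match(r"^print\((.*)\)$", last_line)
    if match:
        last_line = match.group(1)

    try:
        return eval(last_line, environment)
    except Exception:
        # The last line was a statement (or failed); fall back to the printed text.
        return output.getvalue()


result = run_snippet("total = sum(values)\nprint(total)", {"values": [1, 2, 3]})
# result is 6: the argument of the final print call is evaluated and returned.
```

Names missing from the allow-list, such as `open`, raise a NameError inside the snippet. Restricting builtins narrows what generated code can reach, but it should not be treated as a complete sandbox.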

Constants

Some of the package level constants are defined here.

pandasai.constants

Constants used in the pandasai package.

It includes Start & End Code tags, Whitelisted Python Packages and Whitelisted Builtin Methods.
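
As a rough illustration of how start and end code tags could be used, the sketch below pulls the code out of an LLM response. The tag values and the `extract_code` helper are assumptions made for this example; check pandasai.constants for the actual definitions.

```python
import re

# Illustrative values only; the real tags are defined in pandasai.constants.
START_CODE_TAG = "<startCode>"
END_CODE_TAG = "<endCode>"


def extract_code(response: str) -> str:
    """Return the text between the start and end tags (hypothetical helper)."""
    pattern = re.escape(START_CODE_TAG) + r"(.*?)" + re.escape(END_CODE_TAG)
    match = re.search(pattern, response, re.DOTALL)
    if match is None:
        raise ValueError("no code found between the tags")
    return match.group(1).strip()


snippet = extract_code("Here is the code:\n<startCode>\nprint(df.head())\n<endCode>")
# snippet == "print(df.head())"
```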

Exception Handling

The pandasai-specific exception handling mechanism is defined here.

pandasai.exceptions

PandasAI's custom exceptions.

This module contains the implementation of Custom Exceptions.

APIKeyNotFoundError

Bases: Exception

Raised when the API key is not defined/declared.

Parameters:

Name Type Description Default
Exception Exception

APIKeyNotFoundError

required
Source code in pandasai/exceptions.py
class APIKeyNotFoundError(Exception):

    """
    Raised when the API key is not defined/declared.

    Args:
        Exception (Exception): APIKeyNotFoundError
    """
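
A minimal sketch of how such an exception might be raised and handled. The class is redefined here as a stand-in so the example runs without pandasai installed; `load_api_key` and the `EXAMPLE_API_KEY` variable name are hypothetical and not part of the package.

```python
import os


class APIKeyNotFoundError(Exception):
    """Stand-in mirroring the class above."""


def load_api_key(env_var: str = "EXAMPLE_API_KEY") -> str:
    """Read an API key from the environment (hypothetical helper)."""
    key = os.environ.get(env_var)
    if not key:
        raise APIKeyNotFoundError(f"{env_var} is not set")
    return key


try:
    api_key = load_api_key()
except APIKeyNotFoundError as error:
    # Handle the missing key explicitly instead of failing later in an API call.
    api_key = None
    message = str(error)
```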

LLMNotFoundError

Bases: Exception

Raised when the LLM is not provided.

Parameters:

Name Type Description Default
Exception Exception

LLMNotFoundError

required
Source code in pandasai/exceptions.py
class LLMNotFoundError(Exception):
    """
    Raised when the LLM is not provided.

    Args:
        Exception (Exception): LLMNotFoundError
    """

MethodNotImplementedError

Bases: Exception

Raised when a method is not implemented.

Parameters:

Name Type Description Default
Exception Exception

MethodNotImplementedError

required
Source code in pandasai/exceptions.py
class MethodNotImplementedError(Exception):
    """
    Raised when a method is not implemented.

    Args:
        Exception (Exception): MethodNotImplementedError
    """
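
One common use for an exception like this is a base class whose methods must be overridden by subclasses. The sketch below assumes that pattern; `BaseLLM`, `EchoLLM`, and `call` are hypothetical names chosen for illustration, and the exception is redefined as a stand-in so the example is self-contained.

```python
class MethodNotImplementedError(Exception):
    """Stand-in mirroring the class above."""


class BaseLLM:
    """Hypothetical base class: subclasses are expected to override `call`."""

    def call(self, prompt: str) -> str:
        raise MethodNotImplementedError("call method has not been implemented")


class EchoLLM(BaseLLM):
    """Hypothetical subclass that simply returns the prompt."""

    def call(self, prompt: str) -> str:
        return prompt


reply = EchoLLM().call("hello")  # the subclass overrides `call`, so nothing is raised
```

Calling `BaseLLM().call(...)` directly would raise MethodNotImplementedError, which signals a missing override more clearly than a generic failure further down the line.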

NoCodeFoundError

Bases: Exception

Raised when no code is found in the response.

Parameters:

Name Type Description Default
Exception Exception

NoCodeFoundError

required
Source code in pandasai/exceptions.py
class NoCodeFoundError(Exception):
    """
    Raised when no code is found in the response.

    Args:
        Exception (Exception): NoCodeFoundError
    """

UnsupportedOpenAIModelError

Bases: Exception

Raised when an unsupported OpenAI model is used.

Parameters:

Name Type Description Default
Exception Exception

UnsupportedOpenAIModelError

required
Source code in pandasai/exceptions.py
class UnsupportedOpenAIModelError(Exception):
    """
    Raised when an unsupported OpenAI model is used.

    Args:
        Exception (Exception): UnsupportedOpenAIModelError
    """