Module ktrain.tabular.causalinference

Expand source code
def causal_inference_model(
    df,
    method="t-learner",
    metalearner_type=None,
    treatment_col="treatment",
    outcome_col="outcome",
    text_col=None,
    ignore_cols=[],
    include_cols=[],
    treatment_effect_col="treatment_effect",
    learner=None,
    effect_learner=None,
    min_df=0.05,
    max_df=0.5,
    ngram_range=(1, 1),
    stop_words="english",
    verbose=1,
):
    """
    ```
    Infers causality from the data contained in `df` using a metalearner.
    This function is a wrapper to the CausalNLP.CausalInferenceModel class.
    For more details on methods and capabilities of the returned `CausalInferenceModel` object,
    see the [CausalNLP documentation](https://amaiya.github.io/causalnlp/causalinference.html).

    Usage:
    >>> cm = causal_inference_model(df,
                                    treatment_col='Is_Male?',
                                    outcome_col='Post_Shared?', text_col='Post_Text',
                                    ignore_cols=['id', 'email'])
        cm.fit()

    **Parameters:**
    * **df** : pandas.DataFrame containing dataset
    * **method** : metalearner model to use. One of {'t-learner', 's-learner', 'x-learner', 'r-learner'} (Default: 't-learner')
    * **metalearner_type** : Alias of **method** parameter for backwards compatibility.  If not None, overrides method.
    * **treatment_col** : treatment variable; column should contain binary values: 1 for treated, 0 for untreated.
    * **outcome_col** : outcome variable; column should contain the categorical or numeric outcome values
    * **text_col** : (optional) text column containing the strings (e.g., articles, reviews, emails).
    * **ignore_cols** : columns to ignore in the analysis
    * **include_cols** : columns to include as covariates (e.g., possible confounders)
    * **treatment_effect_col** : name of column to hold causal effect estimations.  Does not need to exist.  Created by CausalNLP.
    * **learner** : an instance of a custom learner.  If None, a default LightGBM will be used.
        # Example
         learner = LGBMClassifier(num_leaves=1000)
    * **effect_learner**: used for x-learner/r-learner and must be regression model
    * **min_df** : min_df parameter used for text processing using sklearn
    * **max_df** : max_df parameter used for text procesing using sklearn
    * **ngram_range**: ngrams used for text vectorization. default: (1,1)
    * **stop_words** : stop words used for text processing (from sklearn)
    * **verbose** : If 1, print informational messages.  If 0, suppress.

    **Returns:**
    `CausalNLP.CausalInferenceModel` object
    ```
    """
    try:
        import causalnlp
    except ImportError:
        raise Exception("CausalNLP must be installed: pip install causalnlp")
    from causalnlp import CausalInferenceModel

    return CausalInferenceModel(
        df,
        method=method,
        metalearner_type=metalearner_type,
        treatment_col=treatment_col,
        outcome_col=outcome_col,
        text_col=text_col,
        ignore_cols=ignore_cols,
        include_cols=include_cols,
        treatment_effect_col=treatment_effect_col,
        learner=learner,
        effect_learner=effect_learner,
        min_df=min_df,
        max_df=max_df,
        ngram_range=ngram_range,
        stop_words=stop_words,
        verbose=verbose,
    )

Functions

def causal_inference_model(df, method='t-learner', metalearner_type=None, treatment_col='treatment', outcome_col='outcome', text_col=None, ignore_cols=[], include_cols=[], treatment_effect_col='treatment_effect', learner=None, effect_learner=None, min_df=0.05, max_df=0.5, ngram_range=(1, 1), stop_words='english', verbose=1)
Infers causality from the data contained in `df` using a metalearner.
This function is a wrapper to the CausalNLP.CausalInferenceModel class.
For more details on methods and capabilities of the returned `CausalInferenceModel` object,
see the [CausalNLP documentation](https://amaiya.github.io/causalnlp/causalinference.html).

Usage:
>>> cm = causal_inference_model(df,
                                treatment_col='Is_Male?',
                                outcome_col='Post_Shared?', text_col='Post_Text',
                                ignore_cols=['id', 'email'])
    cm.fit()

**Parameters:**
* **df** : pandas.DataFrame containing dataset
* **method** : metalearner model to use. One of {'t-learner', 's-learner', 'x-learner', 'r-learner'} (Default: 't-learner')
* **metalearner_type** : Alias of **method** parameter for backwards compatibility.  If not None, overrides method.
* **treatment_col** : treatment variable; column should contain binary values: 1 for treated, 0 for untreated.
* **outcome_col** : outcome variable; column should contain the categorical or numeric outcome values
* **text_col** : (optional) text column containing the strings (e.g., articles, reviews, emails).
* **ignore_cols** : columns to ignore in the analysis
* **include_cols** : columns to include as covariates (e.g., possible confounders)
* **treatment_effect_col** : name of column to hold causal effect estimations.  Does not need to exist.  Created by CausalNLP.
* **learner** : an instance of a custom learner.  If None, a default LightGBM will be used.
    # Example
     learner = LGBMClassifier(num_leaves=1000)
* **effect_learner**: used for x-learner/r-learner and must be regression model
* **min_df** : min_df parameter used for text processing using sklearn
* **max_df** : max_df parameter used for text procesing using sklearn
* **ngram_range**: ngrams used for text vectorization. default: (1,1)
* **stop_words** : stop words used for text processing (from sklearn)
* **verbose** : If 1, print informational messages.  If 0, suppress.

**Returns:**
`CausalNLP.CausalInferenceModel` object
Expand source code
def causal_inference_model(
    df,
    method="t-learner",
    metalearner_type=None,
    treatment_col="treatment",
    outcome_col="outcome",
    text_col=None,
    ignore_cols=[],
    include_cols=[],
    treatment_effect_col="treatment_effect",
    learner=None,
    effect_learner=None,
    min_df=0.05,
    max_df=0.5,
    ngram_range=(1, 1),
    stop_words="english",
    verbose=1,
):
    """
    ```
    Infers causality from the data contained in `df` using a metalearner.
    This function is a wrapper to the CausalNLP.CausalInferenceModel class.
    For more details on methods and capabilities of the returned `CausalInferenceModel` object,
    see the [CausalNLP documentation](https://amaiya.github.io/causalnlp/causalinference.html).

    Usage:
    >>> cm = causal_inference_model(df,
                                    treatment_col='Is_Male?',
                                    outcome_col='Post_Shared?', text_col='Post_Text',
                                    ignore_cols=['id', 'email'])
        cm.fit()

    **Parameters:**
    * **df** : pandas.DataFrame containing dataset
    * **method** : metalearner model to use. One of {'t-learner', 's-learner', 'x-learner', 'r-learner'} (Default: 't-learner')
    * **metalearner_type** : Alias of **method** parameter for backwards compatibility.  If not None, overrides method.
    * **treatment_col** : treatment variable; column should contain binary values: 1 for treated, 0 for untreated.
    * **outcome_col** : outcome variable; column should contain the categorical or numeric outcome values
    * **text_col** : (optional) text column containing the strings (e.g., articles, reviews, emails).
    * **ignore_cols** : columns to ignore in the analysis
    * **include_cols** : columns to include as covariates (e.g., possible confounders)
    * **treatment_effect_col** : name of column to hold causal effect estimations.  Does not need to exist.  Created by CausalNLP.
    * **learner** : an instance of a custom learner.  If None, a default LightGBM will be used.
        # Example
         learner = LGBMClassifier(num_leaves=1000)
    * **effect_learner**: used for x-learner/r-learner and must be regression model
    * **min_df** : min_df parameter used for text processing using sklearn
    * **max_df** : max_df parameter used for text procesing using sklearn
    * **ngram_range**: ngrams used for text vectorization. default: (1,1)
    * **stop_words** : stop words used for text processing (from sklearn)
    * **verbose** : If 1, print informational messages.  If 0, suppress.

    **Returns:**
    `CausalNLP.CausalInferenceModel` object
    ```
    """
    try:
        import causalnlp
    except ImportError:
        raise Exception("CausalNLP must be installed: pip install causalnlp")
    from causalnlp import CausalInferenceModel

    return CausalInferenceModel(
        df,
        method=method,
        metalearner_type=metalearner_type,
        treatment_col=treatment_col,
        outcome_col=outcome_col,
        text_col=text_col,
        ignore_cols=ignore_cols,
        include_cols=include_cols,
        treatment_effect_col=treatment_effect_col,
        learner=learner,
        effect_learner=effect_learner,
        min_df=min_df,
        max_df=max_df,
        ngram_range=ngram_range,
        stop_words=stop_words,
        verbose=verbose,
    )