Auto Coder

Automatically codes text fields (e.g., open-ended survey questions) based on linguistic properties such as topic and sentiment.

Autocoder

 Autocoder (verbose=1, device=None)

Autocodes text fields


Autocoder.code_sentiment

 Autocoder.code_sentiment (docs, df, batch_size=8, binarize=False,
                           threshold=0.5)

Autocodes text for positive or negative sentiment

Let’s prepare a toy dataset:

import pandas as pd

ac = Autocoder()
reviews = ["I loved this doctor!", "This doctor was absolutely terrible."]
df = pd.DataFrame({
    'gender': ['female', 'male'],
    'review': reviews,
})
df.head()
gender review
0 female I loved this doctor!
1 male This doctor was absolutely terrible.

After autocoding for sentiment, the dataframe now has extra columns:

result_df = ac.code_sentiment(df['review'].values, df)
result_df.head()
gender review negative positive
0 female I loved this doctor! 0.005034 0.994966
1 male This doctor was absolutely terrible. 0.981789 0.018211
assert result_df[result_df['gender']=='female']['negative'].values[0] < 0.1
assert result_df[result_df['gender']=='female']['positive'].values[0] > 0.9
assert result_df[result_df['gender']=='male']['negative'].values[0] > 0.9
assert result_df[result_df['gender']=='male']['positive'].values[0] < 0.1
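The binarize and threshold parameters from the signature are not demonstrated above. A minimal sketch of what binarization presumably does (thresholding each probability column into 0/1 labels), using plain pandas rather than Autocoder itself:

```python
import pandas as pd

# Toy probabilities mimicking the output above; threshold as in the signature.
scores = pd.DataFrame({'negative': [0.005034, 0.981789],
                       'positive': [0.994966, 0.018211]})
threshold = 0.5
binarized = (scores >= threshold).astype(int)  # elementwise compare, then 0/1
print(binarized)
```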

Autocoder.code_custom_topics

 Autocoder.code_custom_topics (docs, df, labels, batch_size=8,
                               binarize=False, threshold=0.5)

Autocodes text for user-specified topics. The labels parameter is the name of a topic as a string (or a list of topic names).

Let’s prepare a toy dataset:

comments = ["What is your favorite sitcom of all time?", "I cannot wait to vote!"]
df = pd.DataFrame({
    'over_18': ['yes', 'no'],
    'comments': comments,
})
df.head()
over_18 comments
0 yes What is your favorite sitcom of all time?
1 no I cannot wait to vote!

After autocoding, the dataframe has a new column for each custom topic:

result_df = ac.code_custom_topics(df['comments'].values, df, labels=['television', 'film', 'politics'])
result_df.head()
over_18 comments television film politics
0 yes What is your favorite sitcom of all time? 0.981327 0.012260 0.000157
1 no I cannot wait to vote! 0.000518 0.004943 0.936988
assert result_df[result_df['over_18']=='yes']['television'].values[0] > 0.9
assert result_df[result_df['over_18']=='yes']['film'].values[0] < 0.1
assert result_df[result_df['over_18']=='yes']['politics'].values[0] < 0.1
assert result_df[result_df['over_18']=='no']['television'].values[0] < 0.1
assert result_df[result_df['over_18']=='no']['film'].values[0] < 0.1
assert result_df[result_df['over_18']=='no']['politics'].values[0] > 0.9
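Since labels may be either a single string or a list of strings, callers (or a wrapper) may want to normalize the argument; a plausible sketch (normalize_labels is a hypothetical helper, not part of Autocoder):

```python
def normalize_labels(labels):
    # Accept a single topic name or a list of names; always return a list.
    if isinstance(labels, str):
        return [labels]
    return list(labels)

print(normalize_labels('television'))            # ['television']
print(normalize_labels(['television', 'film']))  # ['television', 'film']
```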

Autocoder.code_emotion

 Autocoder.code_emotion (docs, df, batch_size=8, binarize=False,
                         threshold=0.5)

Autocodes text for emotion

comments = ["I'm nervous about tomorrow.", "I got a promotion at work!",
            "My best friend was in a car accident.", "I hate it when I'm cut off in traffic."]
df = pd.DataFrame({
    'over_18': ['yes', 'no', 'yes', 'yes'],
    'comments': comments,
})
df.head()
over_18 comments
0 yes I'm nervous about tomorrow.
1 no I got a promotion at work!
2 yes My best friend was in a car accident.
3 yes I hate it when I'm cut off in traffic.
result_df = ac.code_emotion(df['comments'].values, df, binarize=True)
result_df.head()
over_18 comments joy anger fear sadness
0 yes I'm nervous about tomorrow. 0 0 1 0
1 no I got a promotion at work! 1 0 0 0
2 yes My best friend was in a car accident. 0 0 0 1
3 yes I hate it when I'm cut off in traffic. 0 1 0 0
assert result_df.iloc[0]['fear'] == 1
assert result_df.iloc[1]['joy'] == 1
assert result_df.iloc[2]['sadness'] == 1
assert result_df.iloc[3]['anger'] == 1
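With binarize=True the emotion columns come back one-hot, as above. Whether Autocoder thresholds each score or keeps only the highest-scoring emotion is not shown here; a sketch of the argmax variant in plain Python:

```python
# Toy emotion scores for one text; not actual model output.
probs = {'joy': 0.02, 'anger': 0.01, 'fear': 0.95, 'sadness': 0.02}
top = max(probs, key=probs.get)                # highest-scoring emotion
onehot = {k: int(k == top) for k in probs}     # 1 for the winner, 0 elsewhere
print(onehot)
```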

Autocoder.code_transformer

 Autocoder.code_transformer (docs, df, batch_size=32,
                             model_name='stsb-roberta-large',
                             show_progress_bar=False)

Encode texts as semantically meaningful vectors using a Transformer model

reviews = ["I loved this doctor!", "This doctor was absolutely terrible."]
df = pd.DataFrame({
    'gender': ['female', 'male'],
    'review': reviews,
})
df.head()
gender review
0 female I loved this doctor!
1 male This doctor was absolutely terrible.
df = ac.code_transformer(df.review.values, df)
df.head()
gender review e_0000 e_0001 e_0002 ... e_1021 e_1022 e_1023
0 female I loved this doctor! -0.601180 0.63924 -1.060369 ... 0.864794 -0.179643 -0.095540
1 male This doctor was absolutely terrible. -1.080321 1.28371 0.032944 ... -0.237192 -0.103963 -0.018754

2 rows × 1026 columns
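The e_* columns form one dense embedding per row, so rows can be compared directly, e.g. with cosine similarity. A self-contained sketch on toy vectors (not the real 1024-dimensional embeddings):

```python
import math

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

u = [0.5, -1.2, 0.3]   # toy "embedding" vectors
v = [0.4, -1.0, 0.1]
print(cosine(u, v))
```

In practice the vectors would come from the e_* columns of the coded dataframe, e.g. `df.filter(like='e_').iloc[0].values`.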


Autocoder.code_lda_topics

 Autocoder.code_lda_topics (docs, df, k=10, n_features=10000)

Encode texts as semantically meaningful vectors using Latent Dirichlet Allocation

comments = ["What is your favorite sitcom of all time?", "I cannot wait to vote!"]
df = pd.DataFrame({
    'over_18': ['yes', 'no'] * 5,
    'comments': comments * 5,
})
df.head()
over_18 comments
0 yes What is your favorite sitcom of all time?
1 no I cannot wait to vote!
2 yes What is your favorite sitcom of all time?
3 no I cannot wait to vote!
4 yes What is your favorite sitcom of all time?
df = ac.code_lda_topics(df['comments'].values, df)
preprocessing texts...
fitting model...
iteration: 1 of max_iter: 5
iteration: 2 of max_iter: 5
iteration: 3 of max_iter: 5
iteration: 4 of max_iter: 5
iteration: 5 of max_iter: 5
done.
done.
df.head()
over_18 comments time|favorite|sitcom sitcom|vote|wait wait|vote|favorite time|sitcom|favorite favorite|sitcom|wait wait|time|favorite sitcom|favorite|vote vote|wait|time favorite|vote|time vote|favorite|sitcom
0 yes What is your favorite sitcom of all time? 0.148763 0.093341 0.080723 0.128911 0.109816 0.084724 0.093611 0.080860 0.091758 0.087493
1 no I cannot wait to vote! 0.085687 0.097749 0.142486 0.084145 0.086931 0.099608 0.091913 0.114741 0.093014 0.103728
2 yes What is your favorite sitcom of all time? 0.148763 0.093341 0.080723 0.128911 0.109816 0.084724 0.093611 0.080860 0.091758 0.087493
3 no I cannot wait to vote! 0.085687 0.097749 0.142486 0.084145 0.086931 0.099608 0.091913 0.114741 0.093014 0.103728
4 yes What is your favorite sitcom of all time? 0.148763 0.093341 0.080723 0.128911 0.109816 0.084724 0.093611 0.080860 0.091758 0.087493
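The topic columns above appear to be named by joining each topic's top three words with '|'. A sketch of that naming scheme, assuming a per-topic mapping of word weights (which code_lda_topics does not expose directly):

```python
# Hypothetical word weights for a single LDA topic.
word_weights = {'time': 0.31, 'favorite': 0.24, 'sitcom': 0.18, 'vote': 0.05}

# Take the three highest-weighted words and join them with '|'.
top_words = sorted(word_weights, key=word_weights.get, reverse=True)[:3]
column_name = '|'.join(top_words)
print(column_name)
```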

Autocoder.code_callable

 Autocoder.code_callable (docs, df, fn)

Autocodes text using any user-specified function. The fn parameter must be a Callable that returns a dictionary for each text in docs, where the keys are desired column names and the values are scores or probabilities.

reviews = ["I loved this doctor!", "This doctor was absolutely terrible."]
df = pd.DataFrame({
    'gender': ['female', 'male'],
    'review': reviews,
})
df.head()
gender review
0 female I loved this doctor!
1 male This doctor was absolutely terrible.
def some_function(x):
    val = int('terrible' in x)
    return {'has_the_word_terrible?' : val}
df = ac.code_callable(df.review.values, df, some_function)
df.head()
gender review has_the_word_terrible?
0 female I loved this doctor! 0
1 male This doctor was absolutely terrible. 1
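A callable can also emit several columns at once, since each key of the returned dictionary becomes its own column; a sketch (the column names here are illustrative):

```python
def word_stats(text):
    # One dictionary per text: each key becomes a dataframe column.
    return {'n_words': len(text.split()),
            'has_exclamation': int('!' in text)}

print(word_stats("I loved this doctor!"))
```

Passing word_stats as fn would add n_words and has_exclamation columns to the dataframe.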