PERF: Vectorized string operations are slower than for-loops

- [X] I have checked that this issue has not already been reported.

- [X] I have confirmed this bug exists on the latest version of pandas.

- [X] (optional) I have confirmed this bug exists on the master branch of pandas.

---


#### Code Sample, a copy-pastable example

**In [1]:**
```python
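import pandas as pd
import numpy as np
print(pd.__version__)

# 10,000 random integers standing in for zip codes that lost their leading zeros
non_padded = pd.Series(np.random.randint(100, 99999, size=10000))

def for_loop(series):
    # Plain Python: convert and pad each value in a list comprehension
    return pd.Series([str(zipcode).zfill(5) for zipcode in series])

def vectorized(series):
    # pandas: cast the whole Series to str, then pad via the .str accessor
    return series.astype(str).str.zfill(5)

# %timeit is an IPython magic, so run this in IPython or Jupyter
%timeit for_loop(non_padded)
%timeit vectorized(non_padded)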
```

**Out [1]:**
```
0.25.1
3.32 ms ± 44.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.18 ms ± 60.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

#### Problem description

In most cases, a Python `for` loop over a pandas object is much slower than its vectorized equivalent. Here, however, the vectorized version takes roughly 2.2 times as long as the loop (7.18 ms vs. 3.32 ms). I have reproduced this on macOS and Ubuntu.
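To help narrow down where the time goes, here is a minimal sketch that runs outside IPython using the standard-library `timeit` module and times the two stages of the vectorized path separately. The fixed seed, the `best_ms` helper, and the `timings` labels are my own additions for illustration and are not part of the original report; the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Same shape of data as the report; the fixed seed is an assumption added
# here so that repeated runs use identical input.
rng = np.random.RandomState(0)
non_padded = pd.Series(rng.randint(100, 99999, size=10000))


def for_loop(series):
    return pd.Series([str(zipcode).zfill(5) for zipcode in series])


def vectorized(series):
    return series.astype(str).str.zfill(5)


def best_ms(func, repeat=3, number=5):
    """Hypothetical helper: best average time per call, in milliseconds."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1e3


# Sanity check: both approaches must produce the same padded strings.
assert for_loop(non_padded).tolist() == vectorized(non_padded).tolist()

# Pre-convert once so the padding step can be timed on its own.
as_str = non_padded.astype(str)

timings = {
    "for_loop": best_ms(lambda: for_loop(non_padded)),
    "vectorized": best_ms(lambda: vectorized(non_padded)),
    "astype(str) only": best_ms(lambda: non_padded.astype(str)),
    ".str.zfill(5) only": best_ms(lambda: as_str.str.zfill(5)),
}

for label, ms in timings.items():
    print(f"{label:>20}: {ms:.2f} ms")
```

Comparing the last two rows against the `vectorized` total should indicate whether the cast or the padding dominates.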

#### Output of ``pd.show_versions()``

<details>

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.4.final.0
python-bits      : 64
OS               : Darwin
OS-release       : 18.7.0
machine          : x86_64
processor        : i386
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.1
numpy            : 1.17.2
pytz             : 2019.3
dateutil         : 2.8.0
pip              : 20.0.2
setuptools       : 41.4.0
Cython           : 0.29.13
pytest           : 5.2.1
hypothesis       : None
sphinx           : 2.2.0
blosc            : None
feather          : None
xlsxwriter       : 1.2.1
lxml.etree       : 4.4.1
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.10.3
IPython          : 7.8.0
pandas_datareader: None
bs4              : 4.8.0
bottleneck       : 1.2.1
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.4.1
matplotlib       : 3.1.1
numexpr          : 2.7.0
odfpy            : None
openpyxl         : 3.0.0
pandas_gbq       : None
pyarrow          : 0.15.1
pytables         : None
s3fs             : None
scipy            : 1.3.1
sqlalchemy       : 1.3.9
tables           : 3.5.2
xarray           : 0.14.0
xlrd             : 1.2.0
xlwt             : 1.3.0
xlsxwriter       : 1.2.1

</details>



PERF: Vectorized string operations are slower than for-loops #35864


Activity

dsaxton commented on Aug 23, 2020

asishm commented on Aug 24, 2020

dsaxton commented on Aug 24, 2020

bashtage commented on Aug 24, 2020

asishm commented on Sep 12, 2020

topper-123 commented on Sep 13, 2020
