Question
I'm looking for a string.contains or string.indexof method in Python.
I want to do:
if not somestring.contains("blah"):
    continue
Answers 1
You can use the in operator:
if "blah" not in somestring:
    continue
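Since continue only makes sense inside a loop, here is a minimal sketch of the check in context. The function name lines_containing and the sample data are assumptions made up for illustration; they are not part of the question.

```python
def lines_containing(lines, needle):
    """Return the lines that contain needle, skipping the rest."""
    matches = []
    for somestring in lines:
        if needle not in somestring:
            continue  # skip lines without the substring
        matches.append(somestring)
    return matches


sample = ["blah blah", "nothing here", "some blah text"]
print(lines_containing(sample, "blah"))
```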
Answers 2
If it's just a substring search you can use string.find("substring").
You do have to be a little careful with find, index, and in though, as they are substring searches. In other words, this:
s = "This be a string"
if s.find("is") == -1:
    print("No 'is' here!")
else:
    print("Found 'is' in the string.")
would print "Found 'is' in the string." because "is" occurs inside the word "This".
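If only a whole word should count as a match, one option is a regular expression with word boundaries. This is a sketch under that assumption, not something the answer itself proposes, and the helper name contains_word is hypothetical.

```python
import re


def contains_word(text, word):
    """True if word appears in text as a whole word, not inside another word."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None


s = "This be a string"
print("is" in s)               # substring search
print(contains_word(s, "is"))  # whole-word search
print(contains_word(s, "be"))
```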
Similarly, if "is" in s: would evaluate to True. This may or may not be what you want.
Answers 3
if needle in haystack: is the normal use, as @Michael says. It relies on the in operator, which is more readable and faster than a method call.
If you truly need a method instead of an operator (e.g. to do some weird key= for a very peculiar sort...?), that would be 'haystack'.__contains__. But since your example is for use in an if, I guess you don't really mean what you say;-). It's not good form (nor readable, nor efficient) to use special methods directly: they're meant to be used, instead, through the operators and builtins that delegate to them.
Answers 4
Does Python have a string contains substring method?
Yes, but Python has a comparison operator that you should use instead, because the language intends its usage, and other programmers will expect you to use it. That keyword is in, which is used as a comparison operator:
>>> 'foo' in '**foo**'
True
The opposite (complement), which the original question asks for, is not in:
>>> 'foo' not in '**foo**' # returns False
False
This is semantically the same as not 'foo' in '**foo**', but it's much more readable and explicitly provided for in the language as a readability improvement.
Avoid using __contains__, find, and index
As promised, here's the contains method:
str.__contains__('**foo**', 'foo')
returns True. You could also call this function from the instance of the superstring:
'**foo**'.__contains__('foo')
But don't. Methods that start with underscores are considered semantically non-public, and double-underscore methods like this one are hooks that operators and builtins call for you. The only reason to use this is when extending the in and not in functionality (e.g. if subclassing str):
class NoisyString(str):
    def __contains__(self, other):
        print('testing if "{0}" in "{1}"'.format(other, self))
        return super(NoisyString, self).__contains__(other)

ns = NoisyString('a string with a substring inside')
and now:
>>> 'substring' in ns
testing if "substring" in "a string with a substring inside"
True
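The same hook also serves not in: Python evaluates x not in y by calling __contains__ and negating the result. A small sketch to check this, where the class name RecordingString and its class-level calls list are hypothetical names made up for the illustration:

```python
class RecordingString(str):
    """A str subclass that records every membership test made against it."""

    calls = []  # shared log, kept at class level to keep the sketch short

    def __contains__(self, other):
        RecordingString.calls.append(other)
        return super().__contains__(other)


rs = RecordingString("a string with a substring inside")
found = "substring" in rs
missing = "absent" not in rs
print(found, missing, RecordingString.calls)
```

Both tests end up in the log, so overriding __contains__ once covers both operators.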
Also, avoid the following string methods:
>>> '**foo**'.index('foo')
2
>>> '**foo**'.find('foo')
2
>>> '**oo**'.find('foo')
-1
>>> '**oo**'.index('foo')
Traceback (most recent call last):
  File "<pyshell#40>", line 1, in <module>
    '**oo**'.index('foo')
ValueError: substring not found
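One concrete reason to prefer in for a yes/no test: find returns a position, so its result cannot be used directly as a truth value. A match at index 0 is falsy, and the -1 for "not found" is truthy. The sample string below is made up for illustration.

```python
text = "foo**"

# find returns 0 here because the match starts at the first character.
position = text.find("foo")
print(position, bool(position))

# A missing substring gives -1, which is truthy.
missing = text.find("bar")
print(missing, bool(missing))

# The in operator gives the boolean directly.
print("foo" in text, "bar" in text)
```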
Other languages may have no methods to directly test for substrings, and so you would have to use these types of methods, but with Python, it is much more efficient to use the in comparison operator.
Performance comparisons
We can compare various ways of accomplishing the same goal.
import timeit

def in_(s, other):
    return other in s

def contains(s, other):
    return s.__contains__(other)

def find(s, other):
    return s.find(other) != -1

def index(s, other):
    try:
        s.index(other)
    except ValueError:
        return False
    else:
        return True

perf_dict = {
    'in:True': min(timeit.repeat(lambda: in_('superstring', 'str'))),
    'in:False': min(timeit.repeat(lambda: in_('superstring', 'not'))),
    '__contains__:True': min(timeit.repeat(lambda: contains('superstring', 'str'))),
    '__contains__:False': min(timeit.repeat(lambda: contains('superstring', 'not'))),
    'find:True': min(timeit.repeat(lambda: find('superstring', 'str'))),
    'find:False': min(timeit.repeat(lambda: find('superstring', 'not'))),
    'index:True': min(timeit.repeat(lambda: index('superstring', 'str'))),
    'index:False': min(timeit.repeat(lambda: index('superstring', 'not'))),
}
And now we see that using in is faster than the others (less time for an equivalent operation is better). index is slowest when the substring is absent, because it has to raise and catch a ValueError:
>>> perf_dict
{'in:True': 0.16450627865128808,
'in:False': 0.1609668098178645,
'__contains__:True': 0.24355481654697542,
'__contains__:False': 0.24382793854783813,
'find:True': 0.3067379407923454,
'find:False': 0.29860888058124146,
'index:True': 0.29647137792585454,
'index:False': 0.5502287584545229}
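Absolute timings depend on the machine and the Python version, so it is worth rerunning the comparison locally. The sketch below is a reduced variant that times statement strings directly instead of going through lambdas and wrapper functions, so its numbers are not comparable to the ones above. The names SETUP, STATEMENTS, best_time and results, and the choice of number=100_000, are assumptions for illustration.

```python
import timeit

# Variables are created in setup so the timed statement is only the test itself.
SETUP = "s = 'superstring'; sub = 'str'"

STATEMENTS = {
    "in": "sub in s",
    "__contains__": "s.__contains__(sub)",
    "find": "s.find(sub) != -1",
}


def best_time(stmt, number=100_000):
    """Best of five runs, in seconds per `number` executions of stmt."""
    return min(timeit.repeat(stmt, setup=SETUP, repeat=5, number=number))


results = {name: best_time(stmt) for name, stmt in STATEMENTS.items()}

# Print fastest first.
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:>14}: {seconds:.4f} s")
```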