Strings in Python

Strings are one of the most commonly used data types in Python. They represent text and come with a rich set of methods for manipulation. Whether you're processing user input, formatting output, or analyzing text data, understanding strings is essential.

What are Strings?

A string is an immutable sequence of Unicode characters, written between single, double, or triple quotes.

# Different ways to create strings
single = 'Hello'
double = "World"
triple_single = '''Multi-line
string'''
triple_double = """Another
multi-line"""

# Strings are immutable
text = "Python"
# text[0] = 'J'  # TypeError: 'str' object does not support item assignment

print(type(text))  # <class 'str'>

String Creation

# Basic strings
name = "Alice"
message = 'Hello, World!'

# Multi-line strings
poem = """Roses are red,
Violets are blue,
Python is awesome,
And so are you!"""

# Empty string
empty = ""
empty2 = str()

# String with quotes inside
quote1 = "He said, 'Hello!'"
quote2 = 'She said, "Hi!"'
quote3 = "It's a beautiful day"
quote4 = 'The book "1984" is great'

# Escape characters
escaped = "Line 1\nLine 2\tTabbed"
path = "C:\\Users\\Documents"  # Double backslash
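
Doubling every backslash gets tedious for Windows paths. A raw string (prefix r) treats backslashes as ordinary characters instead. The sketch below is only an illustration; the variable names are made up for this example.

```python
# Hypothetical variable names, used only for illustration.
escaped_path = "C:\\Users\\Documents"   # each \\ produces one backslash
raw_path = r"C:\Users\Documents"        # raw string: backslashes are literal

print(escaped_path == raw_path)  # True

# Escape sequences are not interpreted inside a raw string
print(len("a\nb"))    # 3: 'a', newline, 'b'
print(len(r"a\nb"))   # 4: 'a', backslash, 'n', 'b'
```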

String Indexing and Slicing

text = "Python Programming"

# Indexing (0-based)
print(text[0])   # 'P'
print(text[7])   # 'P'
print(text[-1])  # 'g' (last character)
print(text[-2])  # 'n'

# Slicing [start:end]
print(text[0:6])    # 'Python'
print(text[7:18])   # 'Programming'
print(text[:6])     # 'Python' (from start)
print(text[7:])     # 'Programming' (to end)
print(text[:])      # The whole string

# Slicing with step [start:end:step]
print(text[::2])    # Every 2nd character
print(text[::-1])   # Reverse string
print(text[0:10:2]) # 'Pto r' (indices 0, 2, 4, 6, 8)
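
Indexing and slicing treat out-of-range positions differently. The short sketch below reuses the same text value to show that slices are clamped to the string's bounds, while a single index past the end raises IndexError.

```python
text = "Python Programming"

# Slices are clamped to the string's bounds, so they never raise
print(text[7:100])   # 'Programming'
print(text[50:60])   # '' (empty string)

# Indexing a single position past the end does raise
try:
    text[50]
except IndexError as exc:
    print(f"IndexError: {exc}")

# For any n, the two halves of a split always rebuild the original
n = 6
print(text[:n] + text[n:] == text)  # True
```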

String Operations

# Concatenation
first = "Hello"
last = "World"
full = first + " " + last
print(full)  # "Hello World"

# Repetition
laugh = "Ha" * 3
print(laugh)  # "HaHaHa"

# Length
text = "Python"
print(len(text))  # 6

# Membership
if "Py" in text:
    print("Found!")

if "Java" not in text:
    print("Not found!")

String Methods - Case Conversion

text = "Python Programming"

# Change case
print(text.upper())       # "PYTHON PROGRAMMING"
print(text.lower())       # "python programming"
print(text.capitalize())  # "Python programming"
print(text.title())       # "Python Programming"
print(text.swapcase())    # "pYTHON pROGRAMMING"

# Check case
print("HELLO".isupper())  # True
print("hello".islower())  # True
print("Hello".istitle())  # True

String Methods - Searching

text = "Python is awesome. Python is powerful."

# Find substring
position = text.find("Python")
print(position)  # 0 (first occurrence)

last_pos = text.rfind("Python")
print(last_pos)  # 19 (last occurrence)

# find returns -1 if not found
pos = text.find("Java")
print(pos)  # -1

# Count occurrences
count = text.count("Python")
print(count)  # 2

count = text.count("is")
print(count)  # 2

# Check start/end
print(text.startswith("Python"))  # True
print(text.endswith("powerful.")) # True

# index() - like find() but raises ValueError if not found
pos = text.index("awesome")
print(pos)  # 10
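
Because index() raises ValueError while find() returns -1, the two call for different handling. The helpers below are a minimal sketch; locate and find_all are hypothetical names introduced for this example, not built-in functions.

```python
text = "Python is awesome. Python is powerful."

def locate(haystack, needle):
    """Return the index of needle, or None if it is absent (hypothetical helper)."""
    try:
        return haystack.index(needle)
    except ValueError:
        return None

def find_all(haystack, needle):
    """Return the start index of every occurrence, using find()'s start argument."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions

print(locate(text, "awesome"))   # 10
print(locate(text, "Java"))      # None
print(find_all(text, "Python"))  # [0, 19]
```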

String Methods - Modifying

text = "  Python Programming  "

# Strip whitespace
print(text.strip())   # "Python Programming"
print(text.lstrip())  # "Python Programming  "
print(text.rstrip())  # "  Python Programming"

# Remove specific characters
text2 = "...Python..."
print(text2.strip('.'))  # "Python"

# Replace
text3 = "I love Java"
print(text3.replace("Java", "Python"))  # "I love Python"

# Limit the number of replacements with the optional count argument
text4 = "one one one"
print(text4.replace("one", "two", 2))  # "two two one"
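
Two details of these methods are easy to miss. The argument to strip(), lstrip(), and rstrip() is a set of characters rather than a prefix or suffix, and every method returns a new string instead of changing the original. The sketch below illustrates both; strip_suffix is a hypothetical helper written for this example (Python 3.9+ also provides str.removesuffix for the same job).

```python
domain = "foo.com"

# rstrip(".com") removes any trailing '.', 'c', 'o' or 'm', not the suffix ".com"
print(domain.rstrip(".com"))  # 'f'

def strip_suffix(text, suffix):
    """Remove suffix once if text ends with it (hypothetical helper)."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(strip_suffix(domain, ".com"))  # 'foo'

# The original string is never modified; methods return new strings
print(domain)  # 'foo.com'
```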

String Methods - Splitting and Joining

# Split string into list
text = "Python is awesome"
words = text.split()
print(words)  # ['Python', 'is', 'awesome']

# Split by delimiter
csv = "apple,banana,cherry"
fruits = csv.split(',')
print(fruits)  # ['apple', 'banana', 'cherry']

# Split with max splits
text = "a-b-c-d-e"
parts = text.split('-', 2)
print(parts)  # ['a', 'b', 'c-d-e']

# Split lines
multi_line = "Line 1\nLine 2\nLine 3"
lines = multi_line.splitlines()
print(lines)  # ['Line 1', 'Line 2', 'Line 3']

# Join list into string
words = ['Python', 'is', 'awesome']
text = ' '.join(words)
print(text)  # "Python is awesome"

# Join with different separator
csv = ','.join(['apple', 'banana', 'cherry'])
print(csv)  # "apple,banana,cherry"
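
Calling split() with no argument behaves differently from split(' '): the former collapses runs of whitespace, the latter splits on every single space. The sketch below shows the difference, plus partition() for splitting once around a separator. The sample values are made up for illustration.

```python
# split() with no argument collapses runs of whitespace
print("a  b".split())      # ['a', 'b']

# split(' ') splits on every single space, leaving empty strings
print("a  b".split(' '))   # ['a', '', 'b']

# Combining split() and join() normalizes whitespace
messy = "  Python   is  awesome "
normalized = ' '.join(messy.split())
print(normalized)  # 'Python is awesome'

# partition() splits once and always returns a 3-tuple
key, sep, value = "key=value=more".partition("=")
print(key, value)  # key value=more
```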

String Formatting

Old Style (% formatting)

name = "Alice"
age = 25

# %s for strings, %d for integers, %f for floats
message = "My name is %s and I am %d years old" % (name, age)
print(message)

# Format floats
pi = 3.14159
print("Pi is approximately %.2f" % pi)  # "Pi is approximately 3.14"

str.format() Method

name = "Bob"
age = 30

# Positional arguments
message = "My name is {} and I am {} years old".format(name, age)

# Indexed arguments
message = "Hello {0}, you are {1} years old. {0} is a nice name!".format(name, age)

# Named arguments
message = "My name is {name} and I am {age} years old".format(name=name, age=age)

# Format numbers
print("Pi is {:.2f}".format(3.14159))
print("Number: {:,}".format(1000000))  # "Number: 1,000,000"

f-strings (Python 3.6+, Recommended)

name = "Charlie"
age = 35
height = 5.9

# Basic f-string
message = f"My name is {name} and I am {age} years old"

# Expressions in f-strings
print(f"{name} will be {age + 1} next year")

# Format numbers
print(f"Height: {height:.1f} feet")
print(f"Price: ${19.99:.2f}")

# Alignment and width
print(f"{name:<10} | {age:>5}")  # Left and right align

# Call functions in f-strings
print(f"Uppercase: {name.upper()}")

# Multi-line f-strings
message = f"""
Name: {name}
Age: {age}
Height: {height}
"""

String Validation Methods

# Check string content
print("123".isdigit())      # True - all digits
print("abc".isalpha())      # True - all letters
print("abc123".isalnum())   # True - letters and digits
print("   ".isspace())      # True - all whitespace
print("Hello".isidentifier()) # True - valid Python identifier

# More validation
print("hello world".islower())  # True
print("HELLO WORLD".isupper())  # True
print("123".isnumeric())        # True
print("Hello123".isascii())     # True (Python 3.7+)
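
The three numeric checks are not interchangeable, and none of them accepts a sign or a decimal point. The sketch below compares them on a few sample strings and shows one common alternative for validating integer input: attempt the conversion and catch ValueError. parse_int is a hypothetical helper name.

```python
# Columns: isdecimal, isdigit, isnumeric
samples = ["123", "²", "½", "-5", "3.14"]
for s in samples:
    print(repr(s), s.isdecimal(), s.isdigit(), s.isnumeric())
# '123'  True  True  True
# '²'    False True  True
# '½'    False False True
# '-5'   False False False
# '3.14' False False False

def parse_int(text):
    """Return int(text), or None if text is not a valid integer (hypothetical helper)."""
    try:
        return int(text)
    except ValueError:
        return None

print(parse_int("-5"))    # -5
print(parse_int(" 42 "))  # 42 (int() ignores surrounding whitespace)
print(parse_int("3.14"))  # None
```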

String Alignment and Padding

text = "Python"

# Center
print(text.center(20))      # "       Python       "
print(text.center(20, '*')) # "*******Python*******"

# Left justify
print(text.ljust(20))       # "Python              "
print(text.ljust(20, '-'))  # "Python--------------"

# Right justify
print(text.rjust(20))       # "              Python"
print(text.rjust(20, '.'))  # "..............Python"

# Zero padding (numbers)
num = "42"
print(num.zfill(5))  # "00042"
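
The same alignment and padding can be expressed with format specifications inside f-strings, which is handy when building simple text tables. A minimal sketch follows; the rows are made-up sample data.

```python
rows = [("Alice", 25), ("Bob", 30), ("Charlie", 35)]  # made-up sample data
width = 10

# '<' left-aligns, '>' right-aligns; the width itself can be a nested {expression}
table = [f"{name:<{width}}|{age:>5}" for name, age in rows]
for line in table:
    print(line)
# Alice     |   25
# Bob       |   30
# Charlie   |   35

centered = f"{'Hi':*^8}"   # '^' centers, '*' is the fill character
padded = f"{42:05d}"       # zero-pad an integer to width 5
print(centered)  # ***Hi***
print(padded)    # 00042
```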

Working with Characters

# Iterate through characters
for char in "Python":
    print(char)

# Unicode code points (identical to ASCII for these characters)
print(ord('A'))  # 65
print(ord('a'))  # 97

# Character from a code point
print(chr(65))   # 'A'
print(chr(97))   # 'a'

# Check character properties
char = 'A'
print(char.isupper())  # True
print(char.isdigit())  # False
print(char.isalpha())  # True

String Encoding and Decoding

# Encode string to bytes
text = "Hello, World!"
encoded = text.encode('utf-8')
print(encoded)  # b'Hello, World!'

# Decode bytes to string
decoded = encoded.decode('utf-8')
print(decoded)  # "Hello, World!"

# Different encodings
text = "Café"
utf8 = text.encode('utf-8')
latin1 = text.encode('latin-1')
print(utf8)    # b'Caf\xc3\xa9'
print(latin1)  # b'Caf\xe9'
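
Not every encoding can represent every character, and decoding with the wrong encoding silently produces garbled text. The sketch below shows the errors argument of encode() and a mismatched decode, using the same "Café" sample.

```python
text = "Café"

# ASCII cannot represent 'é', so a strict encode raises
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print(f"Cannot encode: {exc.reason}")

# The errors argument controls what happens instead of raising
replaced = text.encode("ascii", errors="replace")
ignored = text.encode("ascii", errors="ignore")
print(replaced)  # b'Caf?'
print(ignored)   # b'Caf'

# len() counts characters for str and bytes for bytes
print(len(text), len(text.encode("utf-8")))  # 4 5

# Decoding with the wrong encoding does not fail here; it garbles the text
wrong = text.encode("utf-8").decode("latin-1")
print(wrong)  # 'CafÃ©'
```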

Practical Examples

Example 1: Email Validator

def is_valid_email(email):
    """Simple email validation"""
    if '@' not in email:
        return False
    
    if email.count('@') != 1:
        return False
    
    if email.startswith('@') or email.endswith('@'):
        return False
    
    local, domain = email.split('@')
    
    if len(local) == 0 or len(domain) == 0:
        return False
    
    if '.' not in domain:
        return False
    
    return True

# Test
emails = ['user@example.com', 'invalid', '@test.com', 'test@']
for email in emails:
    print(f"{email}: {is_valid_email(email)}")

Example 2: Text Statistics

def analyze_text(text):
    """Analyze text and return statistics"""
    stats = {
        'characters': len(text),
        'letters': sum(c.isalpha() for c in text),
        'digits': sum(c.isdigit() for c in text),
        'spaces': text.count(' '),
        'words': len(text.split()),
        'lines': len(text.splitlines()),
        'uppercase': sum(c.isupper() for c in text),
        'lowercase': sum(c.islower() for c in text)
    }
    return stats

text = """Python is a high-level programming language.
It was created in 1991 by Guido van Rossum."""

stats = analyze_text(text)
for key, value in stats.items():
    print(f"{key}: {value}")

Example 3: Title Case Converter

def to_title_case(text):
    """Convert to title case, handling special words"""
    small_words = {'a', 'an', 'the', 'and', 'but', 'or', 'for', 'nor', 'on', 'at', 'to', 'by', 'of', 'in'}
    
    words = text.lower().split()
    
    # Capitalize first word
    if words:
        words[0] = words[0].capitalize()
    
    # Capitalize other words unless they're small words
    for i in range(1, len(words)):
        if words[i] not in small_words:
            words[i] = words[i].capitalize()
    
    return ' '.join(words)

# Test
print(to_title_case("the lord of the rings"))
# "The Lord of the Rings"

Example 4: Password Strength Checker

def check_password_strength(password):
    """Check password strength"""
    score = 0
    feedback = []
    
    if len(password) >= 8:
        score += 1
    else:
        feedback.append("Use at least 8 characters")
    
    if any(c.isupper() for c in password):
        score += 1
    else:
        feedback.append("Add uppercase letters")
    
    if any(c.islower() for c in password):
        score += 1
    else:
        feedback.append("Add lowercase letters")
    
    if any(c.isdigit() for c in password):
        score += 1
    else:
        feedback.append("Add numbers")
    
    if any(c in '!@#$%^&*' for c in password):
        score += 1
    else:
        feedback.append("Add special characters")
    
    strength = ['Very Weak', 'Weak', 'Fair', 'Good', 'Strong', 'Very Strong'][score]
    
    return {
        'score': score,
        'strength': strength,
        'feedback': feedback
    }

# Test
result = check_password_strength("Passw0rd!")
print(f"Strength: {result['strength']}")
for tip in result['feedback']:
    print(f"- {tip}")

Example 5: Parse CSV Data

def parse_csv_line(line):
    """Parse a CSV line handling quoted fields"""
    fields = []
    current_field = ""
    in_quotes = False
    
    for char in line:
        if char == '"':
            in_quotes = not in_quotes
        elif char == ',' and not in_quotes:
            fields.append(current_field.strip())
            current_field = ""
        else:
            current_field += char
    
    fields.append(current_field.strip())
    return fields

# Test
csv_line = 'John,Doe,"123 Main St, Apt 4",New York,NY'
fields = parse_csv_line(csv_line)
print(fields)
# ['John', 'Doe', '123 Main St, Apt 4', 'New York', 'NY']
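
The hand-written parser above toggles on every quote character, so it cannot represent an escaped quote ("") inside a quoted field. For real data, the standard library's csv module handles these cases. A brief sketch, with a made-up second sample line:

```python
import csv

csv_line = 'John,Doe,"123 Main St, Apt 4",New York,NY'

# csv.reader accepts any iterable of lines; wrap the single line in a list
fields = next(csv.reader([csv_line]))
print(fields)
# ['John', 'Doe', '123 Main St, Apt 4', 'New York', 'NY']

# A doubled quote inside a quoted field is an escaped literal quote
tricky = 'a,"He said ""hi""",c'
tricky_fields = next(csv.reader([tricky]))
print(tricky_fields)
# ['a', 'He said "hi"', 'c']
```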

Common String Patterns

Pattern 1: Reverse a String

text = "Python"
reversed_text = text[::-1]
print(reversed_text)  # "nohtyP"

Pattern 2: Check Palindrome

def is_palindrome(text):
    cleaned = ''.join(c.lower() for c in text if c.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome("A man a plan a canal Panama"))  # True
print(is_palindrome("race car"))  # True
print(is_palindrome("hello"))  # False

Pattern 3: Remove Duplicates

def remove_duplicate_chars(text):
    return ''.join(dict.fromkeys(text))

print(remove_duplicate_chars("hello"))  # "helo"

Pattern 4: Count Words

def count_words(text):
    return len(text.split())

text = "Python is an amazing programming language"
print(f"Word count: {count_words(text)}")  # "Word count: 6"

Common Mistakes

Mistake 1: Strings are Immutable

# WRONG: Can't modify string directly
text = "Python"
# text[0] = 'J'  # TypeError

# CORRECT: Create new string
text = 'J' + text[1:]
print(text)  # "Jython"

Mistake 2: Concatenating in Loops

# INEFFICIENT: each += can copy everything built so far
result = ""
for i in range(1000):
    result += str(i)

# EFFICIENT: Use join
result = ''.join(str(i) for i in range(1000))
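
When the loop body is too involved for a single generator expression, a common alternative is to collect the pieces in a list and join once at the end. A minimal sketch; the even-number filter is arbitrary and only there to give the loop some logic.

```python
# Collect pieces in a list, then join once at the end
parts = []
for i in range(1000):
    if i % 2 == 0:            # arbitrary filter, just for illustration
        parts.append(str(i))

result = ",".join(parts)
print(result[:9])  # '0,2,4,6,8'
```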

Mistake 3: Using == vs is

# Use == for value comparison
str1 = "hello"
str2 = "hello"
print(str1 == str2)  # True (correct)

# 'is' checks identity, not value
print(str1 is str2)  # May be True due to string interning, but don't rely on it
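
To see why identity checks are unreliable, build an equal string at runtime. In the sketch below the two values compare equal, yet in CPython they are typically distinct objects. The identity result is an implementation detail, which is exactly why it should not be relied on.

```python
a = "hello world!"
b = "".join(["hello", " ", "world!"])  # equal value, built at runtime

print(a == b)  # True: same characters
print(a is b)  # typically False in CPython: two separate objects
```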

Practice Exercises

Exercise 1: Acronym Generator

Create a function that generates an acronym from a phrase. Example: "Python Programming Language" → "PPL"

Exercise 2: Word Wrapper

Wrap text to a specified line length.

Exercise 3: String Compressor

Compress: "aaabbcccc" → "a3b2c4"

Sample Solutions

Exercise 1:

def create_acronym(phrase):
    words = phrase.split()
    return ''.join(word[0].upper() for word in words)

print(create_acronym("Python Programming Language"))  # "PPL"

Exercise 2:

def wrap_text(text, width):
    words = text.split()
    lines = []
    current_line = []
    current_length = 0
    
    for word in words:
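        # An empty line always takes the next word; len(current_line) counts the joining spaces
        if not current_line or current_length + len(word) + len(current_line) <= width: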
            current_line.append(word)
            current_length += len(word)
        else:
            lines.append(' '.join(current_line))
            current_line = [word]
            current_length = len(word)
    
    if current_line:
        lines.append(' '.join(current_line))
    
    return '\n'.join(lines)
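
Exercise 3:

One possible approach, sketched with itertools.groupby to collect runs of identical characters. The function name compress is an assumption made for this example, and this version also writes a count of 1 for single characters.

```python
from itertools import groupby

def compress(text):
    """Run-length encode text: each run becomes the character plus its count."""
    return ''.join(f"{char}{len(list(run))}" for char, run in groupby(text))

print(compress("aaabbcccc"))  # "a3b2c4"
print(compress("abc"))        # "a1b1c1"
```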

Key Takeaways

Strings are immutable sequences of characters
Use f-strings for modern string formatting
Strings provide a rich set of methods for manipulation and validation
String slicing is powerful for extracting substrings
Use join() for efficient concatenation
Common operations: split(), strip(), replace(), find()
Strings can be iterated character by character

What's Next?

Lists and Tuples - Store collections of strings
Loops - Process strings iteratively
Functions - Create reusable string operations

Strings are everywhere in programming. Master string manipulation and you'll handle text data with confidence!