Lab 2
- [1 `check_char_in_string.py`](#1 check_char_in_string.py)
- [2 `count_char_occurrence.py`](#2 count_char_occurrence.py)
  - [2.1 counting character](#2.1 counting character)
  - [2.2 counting substring](#2.2 counting substring)
    - [2.2.1 problem](#2.2.1 problem)
    - [2.2.2 reason](#2.2.2 reason)
    - [2.2.3 solution](#2.2.3 solution)
- [3 `get_char_index.py`](#3 get_char_index.py)
  - 3.1
  - [3.2 improved program](#3.2 improved program)
- [4 `remove_leading_ending_spaces.py`](#4 remove_leading_ending_spaces.py)
- [5 `extract_substring.py`](#5 extract_substring.py)
  - 5.1
  - [5.2 index out of bound](#5.2 index out of bound)
  - [5.3 negative index](#5.3 negative index)
- [6 `list_operations.py`](#6 list_operations.py)
  - 6.1
  - [6.2 `add_element_to_end()`](#6.2 add_element_to_end())
1 check_char_in_string.py
running result:
1.1
- several different arguments were tried, as shown in the screenshot above
- every result matches the expectation
1.2
The keyword `in` in the `check_char_in_string` function checks whether a specific character is present in the given string. It is Python's membership operator: it returns `True` if the character is found in the string and `False` otherwise. Using `in` gives a concise way to test for the existence of a character in a string.
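The behaviour described above can be sketched as follows. The body of `check_char_in_string` is not shown in this report, so the signature and implementation below are assumptions; the point is only to show how `in` behaves, including that it is case-sensitive and also accepts substrings.

```python
def check_char_in_string(input_string, char_to_check):
    # Assumed implementation: `in` performs the membership test directly.
    return char_to_check in input_string


print(check_char_in_string("Fudan python", "F"))   # True
print(check_char_in_string("Fudan python", "f"))   # False: the test is case-sensitive
print(check_char_in_string("Fudan python", "py"))  # True: substrings work as well
```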
2 count_char_occurrence.py
running result:
2.1 counting character
When counting a character:
- every result matches the expectation
- When the character is not in the input string, the result is `0`.
  For example: run `count_char_occurrence("Fudan python", "f")` → returns `0`
- When the character is in the input string, the result is the number of times it occurs.
  For example: run `count_char_occurrence("Fudan python", "n")` → returns `2`
2.2 counting substring
When counting a substring:
- run `count_char_occurrence("Fudan python", "python")` → returns `1`
- run `count_char_occurrence("aaaa", "aa")` → returns `2`
- run `count_char_occurrence("ababa", "aba")` → returns `1`
2.2.1 problem
The result shows that the `count_char_occurrence` function also works for counting substrings, but there is still a problem.
- In the first example, the result matches the expected answer.
- In the second example, the result should be `3` if overlapping occurrences are counted, but the function returns `2` instead.
- In the last example, the result should be `2`, but it returns `1`.

So `count_char_occurrence` can be used to count substrings, but because it relies on Python's `count()` method, it does not account for overlapping occurrences of a substring, which leads to a lower count than expected in cases like these.
2.2.2 reason
The reason is that the `count()` method counts only non-overlapping occurrences. After each match, it resumes searching from the end of that match rather than from the next character, so any occurrence that begins inside a previous match is skipped.
2.2.3 solution
To solve this problem, we can use regular expressions to match overlapping substrings.
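```python
import re

def count_char_occurrence_improved(input_string, char_to_check):
    # Escape the target so characters such as "." or "*" are matched literally.
    pattern = re.compile(f'(?={re.escape(char_to_check)})')
    # A lookahead matches without consuming text, so overlapping hits are all found.
    matches = re.findall(pattern, input_string)
    return len(matches)
```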
This modified function uses the "lookahead" mechanism of regular expressions, which allows it to match overlapping substrings.
running result:
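As an alternative that avoids regular expressions, overlapping occurrences can also be counted by testing every start position in turn. This is only a sketch: `count_overlapping` is a hypothetical helper name, and treating an empty substring as zero matches is an assumption made here.

```python
def count_overlapping(input_string, substring):
    """Hypothetical helper: count occurrences of substring, overlaps included."""
    if not substring:
        return 0  # assumption: an empty substring counts as zero matches
    last_start = len(input_string) - len(substring)
    # Check whether the substring begins at each possible start position.
    return sum(
        input_string.startswith(substring, i) for i in range(last_start + 1)
    )


print("aaaa".count("aa"), count_overlapping("aaaa", "aa"))      # 2 3
print("ababa".count("aba"), count_overlapping("ababa", "aba"))  # 1 2
```

Because every position is checked independently, a match that begins inside an earlier match is still counted.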
3 get_char_index.py
running result:
3.1
- When the character or substring is not in the input string, a `ValueError` exception is raised.
  For example: run `get_char_index("Fudan python at 1:30pm", "z")` → `ValueError: substring not found`
- When the character or substring is in the input string, the result is the index of its first occurrence in the given string.
  For example:
  - run `get_char_index("Fudan python at 1:30pm", "p")` → returns `6`
  - run `get_char_index("Fudan python at 1:30pm", " ")` → returns `5`

  It works the same way on substrings:
  - run `get_char_index("Fudan python at 1:30pm", "at")` → returns `13`
  - run `get_char_index("Fudan python at 1:30pm to 4:10pm", "pm")` → returns `20`
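A caller that prefers not to handle an exception can wrap the call. The sketch below assumes that `get_char_index` simply delegates to `str.index()` (its body is not shown in this report, but the error message matches); `safe_get_char_index` is a hypothetical name introduced only for illustration.

```python
def get_char_index(input_string, char_to_check):
    # Assumed implementation: str.index() raises ValueError when nothing is found.
    return input_string.index(char_to_check)


def safe_get_char_index(input_string, char_to_check):
    # Hypothetical wrapper: return -1 instead of raising, like str.find() does.
    try:
        return get_char_index(input_string, char_to_check)
    except ValueError:
        return -1


print(safe_get_char_index("Fudan python at 1:30pm", "at"))  # 13
print(safe_get_char_index("Fudan python at 1:30pm", "z"))   # -1
```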
3.2 improved program
`get_char_index` only returns the first occurrence. To find every occurrence, we can loop over the entire string and record the index each time the character appears.
```python
def get_char_indices(input_string, char_to_check):
    indices = []
    # Compare each character with the target and record matching positions.
    for i in range(len(input_string)):
        if input_string[i] == char_to_check:
            indices.append(i)
    return indices
```
running result:
However, this program only finds all positions of a single character in the target string; it does not support substrings. Since the implementation for substrings is similar, I will skip that part.
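For reference, one possible shape of the substring variant is sketched below. The name `get_substring_indices` is hypothetical, and reporting overlapping matches is a design choice assumed here, not something the lab specifies.

```python
def get_substring_indices(input_string, substring):
    """Hypothetical variant: start index of every occurrence of substring."""
    indices = []
    if not substring:
        return indices  # assumption: an empty substring has no positions
    start = input_string.find(substring)
    while start != -1:
        indices.append(start)
        # Move one character forward so overlapping matches are also reported.
        start = input_string.find(substring, start + 1)
    return indices


print(get_substring_indices("Fudan python at 1:30pm to 4:10pm", "pm"))  # [20, 30]
print(get_substring_indices("ababa", "aba"))                            # [0, 2]
```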
4 remove_leading_ending_spaces.py
running result:
4.1
- run `remove_leading_ending_spaces(" Fudan python ")` → returns `'Fudan python'`
- run `remove_leading_ending_spaces(" ")` → returns `''`
- run `remove_leading_ending_spaces("")` → returns `''`
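These results are consistent with an implementation based on `str.strip()`, although the function body is not shown here, so the version below is an assumption. It also illustrates one detail worth knowing: `strip()` with no argument removes all leading and trailing whitespace (tabs and newlines included), while `strip(" ")` removes only space characters.

```python
def remove_leading_ending_spaces(input_string):
    # Assumed implementation: strip() drops leading and trailing whitespace.
    return input_string.strip()


print(repr(remove_leading_ending_spaces("   Fudan python   ")))  # 'Fudan python'
print(repr(remove_leading_ending_spaces("   ")))                  # ''

# strip() vs strip(" "): only the former removes tabs and newlines.
print(repr("\t Fudan python \n".strip()))     # 'Fudan python'
print(repr("\t Fudan python \n".strip(" ")))  # '\t Fudan python \n'
```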
5 extract_substring.py
running result:
5.1
- several different arguments were tried, as shown in the screenshot above
- all results meet the expectation
5.2 index out of bound
When the start index or end index goes out of bounds, Python's slice operation automatically clamps the index instead of raising an error.
In the second function call (run `extract_substring("Fudan python", 10, 100)` → returns `'on'`), the end index exceeds the length of the string, so it is clamped to the length of the string (12). Therefore, no error is raised and the substring `'on'` is returned.
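The contrast between slicing and plain indexing can be sketched as follows, assuming `extract_substring` is a thin wrapper around a slice (its body is not shown in this report).

```python
def extract_substring(input_string, start, end):
    # Assumed implementation: a plain slice, which clamps out-of-range bounds.
    return input_string[start:end]


text = "Fudan python"
print(repr(extract_substring(text, 10, 100)))  # 'on'
print(repr(extract_substring(text, 50, 100)))  # '': start is past the end

# Indexing a single position, unlike slicing, does raise an error.
try:
    text[100]
except IndexError as exc:
    print("IndexError:", exc)
```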
5.3 negative index
In the third function call (run `extract_substring("Fudan python", -2, 100)` → returns `'on'`), the output is as expected.
A negative start index specifies an offset relative to the end of the string. For example, `-2` refers to the second-to-last character, so the returned result is `'on'`.
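As a small worked check, a negative index `-k` in a slice behaves like `len(text) - k`, and a negative start that reaches before the beginning of the string is clamped to `0`. The variable names below are chosen only for illustration.

```python
text = "Fudan python"
start = -2

# A negative start is measured from the end: -2 is the same as len(text) - 2.
equivalent_start = len(text) + start
print(equivalent_start)               # 10
print(repr(text[start:100]))          # 'on'
print(repr(text[equivalent_start:]))  # 'on'

# A start that falls before the beginning of the string is clamped to 0.
print(repr(text[-100:3]))             # 'Fud'
```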
6 list_operations.py
6.1
running result:
The first four functions are very similar to what has already been done for strings above. Elements in a list behave much like characters in a string. For example, the `index()` method returns only the index of the first occurrence of the specified character or element.
The only difference between them is that a string can be searched for a substring directly, while a list cannot be searched for a sublist with the code above, because `in` and `index()` on a list compare the target against individual elements.
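If a sublist search were needed, one way to do it is to compare slices of the list against the target. The helper `find_sublist` below is hypothetical and not part of `list_operations.py`; it is only a sketch of the idea.

```python
def find_sublist(items, sublist):
    """Hypothetical helper: index of the first occurrence of sublist, or -1."""
    n = len(sublist)
    # Slide a window of length n over the list and compare it with the target.
    for i in range(len(items) - n + 1):
        if items[i:i + n] == sublist:
            return i
    return -1


numbers = [1, 2, 3, 4, 2, 3]
print([2, 3] in numbers)              # False: `in` looks for a single element equal to [2, 3]
print(find_sublist(numbers, [2, 3]))  # 1
print(find_sublist(numbers, [3, 2]))  # -1
```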
6.2 add_element_to_end()
running result:
- several different arguments were tried, as shown in the screenshot above
- every result matches the expectation