CPT304-2425-S2-Software Engineering II

Week 1 Software Crisis

Software crisis is a term used in the early days of computing science for the difficulty of writing useful and efficient computer programs in the required time. The software crisis was due to the rapid increases in computer power and the complexity of the problems.

No Silver Bullet

Brooks argues that there is no single breakthrough, no "silver bullet", that will dramatically improve software development productivity or reliability by an order of magnitude within a decade.

Can the software crisis be solved? No.

Software is inherently hard due to essential difficulties. Tech helps reduce surface (accidental) issues, but core challenges remain. A mix of reuse, prototyping, and great designers may help.

Essential difficulties*

are the inherent (intrinsic) challenges associated with the nature of software itself.

e.g., requirements and frequently changing business environments.

Complexity

Inherent complexity from vast states and interactions
Difficulties:
Communication problems -- harder for team members to understand each other.
State explosion -- too many possible states, making the system unreliable and impossible to fully understand. Complex functions are hard to use correctly.
Extension difficulties -- adding new features may cause side effects.
Security risks -- hidden states can become trapdoors.
Management challenges -- hard to maintain conceptual integrity and control dependencies.

Conformity

Must align with complex, evolving human systems (laws, processes, etc.).

Software is expected to conform because it is flexible and arrives late: the existing institutions (business processes, legal regulations, user habits, and so on) were already in place.

Conformity:

High flexibility through easy code changes

Modular design minimizes impact of changes

Easy to adapt and maintain

Changeability

Constantly changes due to evolving requirements, technologies, and environments.

Successful software is always modified and often outlives its original platform

Invisibility

No physical form; difficult to visualize and conceptualize.

Accidents (accidental difficulties)

Accidental difficulties are challenges not inherent to the nature of software, but instead emerge from the limitations of existing technologies, tools, and development environments (extrinsic).

e.g., intricacies of a programming language's syntax and an inefficient development process

Promising Attacks

Build vs. Buy

• Prefer buying over building

• Off-the-shelf for the mass market

• Immediate delivery, with fewer errors

• Low cost, cost distributed among users

• May not fit all needs

• Shift from software company to consulting company.

Requirement Refinement & Prototyping

Addresses the hardest challenge in software development: precisely defining what to build.

• It is impossible to specify the exact requirements completely, precisely, and correctly.

• Use **incremental development**: grow, don't build. This reduces risk and improves alignment with user needs.

Contemporary technology solutions

How contemporary practices like Agile and AI-enhanced tools are used to tackle the essential difficulties identified by Frederick Brooks.

Agile Methodology:
Complexity: Breaks work into small sprints; uses frequent feedback to reduce confusion.
Conformity: Collaborates with customers to ensure software fits needs.
Changeability: Welcomes changing requirements, adapts quickly.
Invisibility: Uses transparent processes (daily meetings, task boards) to increase visibility.

AI-Enhanced Tools:
Complexity: Automates code analysis; models system behaviors.
Conformity: Uses NLP to help meet regulations.
Changeability: Predicts code areas needing change for proactive updates.
Invisibility: Creates visualizations of software structure and flows.

Week 2 Object-oriented Concepts

**Four Pillars of OOP:** 1. Abstraction 2. Encapsulation 3. Inheritance 4. Polymorphism

Software has a good design if it is easy and low-cost to change.

1. Hide inherent complexity: abstraction, encapsulation, inheritance

2. Remove accidental complexity: simplification, reusability, design patterns

3. High cohesion

4. Loose coupling

Cohesion & Coupling

Cohesion

The code is narrow and focused; it does only one thing and does it well.

High Cohesion

Each module focuses on one task and does it well.

Related code stays together, reducing change frequency.

Coupling

The degree of dependency in the code

Loose Coupling
Minimize dependencies between components.

Depend on interfaces (not classes), and use composition to reduce tight coupling.

SOLID Principles

SOLID is a mnemonic for five design principles intended to make software designs more understandable, flexible and maintainable.

S -- Single Responsibility Principle*

• A class should have only one reason to change.

• A class should have one clear responsibility, fully encapsulated, to reduce complexity and make changes safer.

• A class with many responsibilities is hard to reuse, understand, test, debug, and navigate. It tends to be cluttered with code, leads to duplication, is involved in modifications frequently, and makes every change riskier.

O -- Open/closed Principle*

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
**main idea :**Add new features without changing existing code.

• Inheritance supports the Open/Closed Principle but can cause tight coupling if subclasses rely on parent details.

• Use interfaces instead of superclasses to allow flexible implementations without changing existing code. Interfaces stay closed to modification while letting you add new implementations to extend functionality.
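The interface-based approach can be sketched as follows. This is only an illustration: the names Shape, Circle, Rectangle, and AreaCalculator are assumptions made for the example and do not come from the course material.

```java
import java.util.List;

// Hypothetical abstraction: existing code depends only on this interface.
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class Rectangle implements Shape {
    private final double width;
    private final double height;
    Rectangle(double width, double height) { this.width = width; this.height = height; }
    public double area() { return width * height; }
}

// Closed for modification: this class does not change when a new Shape is added.
class AreaCalculator {
    static double totalArea(List<Shape> shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}

class OcpDemo {
    public static void main(String[] args) {
        List<Shape> shapes = List.of(new Rectangle(2, 3), new Circle(1));
        System.out.println(AreaCalculator.totalArea(shapes));
    }
}
```

Adding a Triangle would mean writing one new class that implements Shape; AreaCalculator stays untouched, which is the "open for extension, closed for modification" idea.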

L -- Liskov Substitution Principle

Inheritance should be used only for substitutability: derive a class only when the subclass can fully replace the parent class (it requires no more and promises no less).

Use inheritance when B is-a A (an object of B can be used anywhere an object of A is expected).

Use composition when B uses A (an object of B uses an object of A).

Why:

The user of a base class should be able to use an instance of a derived class without knowing the difference.

Subclasses should work in place of parent classes without breaking client code, maintaining compatible behavior.

Rule 1 - Parameter Types: Subclass method parameters should be as general as or more abstract than those in the base class.
Rule 2 - Return Types: Subclass method return types should match or be a subtype of the base class return type.

Rule 3 - Exception Handling: Subclass methods shouldn't throw exceptions that the base class doesn't throw.

Rule 4 - Preconditions: A subclass shouldn't strengthen pre-conditions.

Rule 5 - Postconditions: A subclass shouldn't weaken post-conditions.

Rule 6 - A subclass must preserve the invariants of the superclass (e.g., object properties that are always true in valid states).

• This is probably the least formal rule of all.

• A class invariant is an assertion concerning object properties that must be true for all valid states of the object.

• The rule on invariants is the easiest to violate because you might misunderstand or not realize all of the invariants of a complex class.

• The safest way to extend a class is to add new fields or methods in the subclass, without modifying existing ones from the superclass.

Rule 7 - State Changes: A subclass shouldn't modify private/protected states of the superclass.

LSP importance

Adhering to LSP promotes flexible and reusable code.

Violating LSP creates tight coupling and complicates maintenance.

If client code cannot freely substitute base class references with subclass objects, it requires instanceof checks and special handling, leading to maintenance difficulties.

Code that follows LSP enables the right abstractions and better code maintainability.

I -- Interface Segregation Principle

Do not force classes to implement interface methods they do not need.

Approach: split the interface into smaller ones.

Clients should not be forced to depend on interfaces they don't use. Keep interfaces narrow so clients implement only the behaviors they need.

• This principle is easy to violate as software grows with more features.

• Like Single Responsibility Principle, it reduces complexity and change impact by splitting software into independent parts.

D -- Dependency inversion Principle

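Both high-level and low-level modules should depend on abstractions (interfaces) rather than directly on each other.

Approach: introduce abstractions.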

High-level modules contain complex business logic and should be reusable and stable.
Low-level modules provide basic operations like disk access, networking, and database connection.

To decouple them, introduce abstractions (interfaces) that both depend on.

Dependency Inversion Principle states:

High-level modules should not depend on low-level modules directly; both should depend on abstractions.
Abstractions should not depend on details; details depend on abstractions.

Implementation:

Develop high-level modules first, depending on abstractions.

Then implement low-level modules by fulfilling those abstractions.

This approach improves flexibility and maintainability.
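A minimal sketch of these two steps, assuming a hypothetical notification feature. The names MessageStore, InMemoryMessageStore, and NotificationService are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction that both levels depend on (hypothetical name).
interface MessageStore {
    void save(String message);
    int count();
}

// High-level module: holds the business rule and knows only the interface.
class NotificationService {
    private final MessageStore store;

    NotificationService(MessageStore store) {
        this.store = store;
    }

    // Business rule (assumed for the example): blank messages are rejected.
    boolean notifyUser(String text) {
        if (text == null || text.trim().isEmpty()) {
            return false;
        }
        store.save(text.trim());
        return true;
    }
}

// Low-level module: written afterwards to fulfil the abstraction.
class InMemoryMessageStore implements MessageStore {
    private final List<String> messages = new ArrayList<>();

    public void save(String message) { messages.add(message); }
    public int count() { return messages.size(); }
}

class DipDemo {
    public static void main(String[] args) {
        MessageStore store = new InMemoryMessageStore();
        NotificationService service = new NotificationService(store);
        service.notifyUser("Order shipped");
        System.out.println(store.count());
    }
}
```

Swapping the in-memory store for a database-backed one would only require another MessageStore implementation; NotificationService would not change.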

Composition

Use composition as an alternative to multiple inheritance

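Composition does not require a separate subclass for every possible combination of features, which reduces the number of classes and the overall complexity.

Composition makes dynamic change possible (think about polymorphism).

Use composition to avoid the 'combinatorial explosion'.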
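To make the combinatorial-explosion point concrete, here is a small sketch with hypothetical names (MoveBehavior, CookBehavior, ComposedRobot). Two ways of moving and two ways of cooking would need four subclasses under inheritance; with composition the four small behavior classes are combined freely and can be swapped at runtime.

```java
// Hypothetical behavior interfaces, invented for this illustration.
interface MoveBehavior { String move(); }
interface CookBehavior { String cook(); }

class Walk implements MoveBehavior { public String move() { return "walking"; } }
class Fly implements MoveBehavior { public String move() { return "flying"; } }
class ChineseCooking implements CookBehavior { public String cook() { return "Chinese food"; } }
class WesternCooking implements CookBehavior { public String cook() { return "Western food"; } }

// The robot is composed of behaviors instead of inheriting one fixed combination.
class ComposedRobot {
    private MoveBehavior mover;
    private final CookBehavior cook;

    ComposedRobot(MoveBehavior mover, CookBehavior cook) {
        this.mover = mover;
        this.cook = cook;
    }

    // Dynamic change: replace a behavior while the program runs.
    void setMover(MoveBehavior mover) { this.mover = mover; }

    String describe() { return mover.move() + ", cooking " + cook.cook(); }
}

class CompositionDemo {
    public static void main(String[] args) {
        ComposedRobot robot = new ComposedRobot(new Walk(), new ChineseCooking());
        System.out.println(robot.describe());
        robot.setMover(new Fly());
        System.out.println(robot.describe());
    }
}
```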

Week 3-4 Design Pattern

Design patterns are typical solutions to commonly occurring problems in software design.

Think of them as customizable blueprints, not copy-paste code.
Offer guidance for structuring code in reusable and maintainable ways.

Key Elements of a Pattern:

Intent: Briefly describes the problem and the proposed solution.
Motivation: Explains the context and how the pattern solves the problem.
Structure: Class diagrams showing relationships.
Code Example: Real-world sample in a common programming language.
Consequences: Benefits and trade-offs of using the pattern.

Use

Use patterns as a communication tool, not a starting point.

Don't force them---let your design evolve toward them naturally.

Can serve as lessons---what to do or avoid.

Benefits
Reusability: Speeds up development by reusing solutions.
Best Practices: Reflect professional experience.
Maintainability: Leads to clearer, more organized code.
Team Communication: Provides a shared vocabulary.

Drawbacks
Overuse: Can introduce unnecessary complexity.
Initial Overhead: The initial learning curve may slow down early development.
Specificity, Not Universal: Patterns are situational, not one-size-fits-all.
Dependence: Can limit creativity and discourage alternative approaches.
Difficulty in Learning: Some patterns are complex to learn or implement.

Creational patterns

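Concerned with how objects are created, so that instances can be created more flexibly and efficiently.

Creational patterns provide object creation mechanisms that increase flexibility and reuse of existing code.

Core idea: hide the object creation logic instead of instantiating directly with new.

Typical use cases: controlling the object creation process, deferring creation, reusing objects, and so on.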

Advantages

Flexibility: Decouple the system from the specifics of how objects are created, composed, or represented.
Reusability: Centralized object creation (e.g., factories) reduces code duplication and encourages reuse across the codebase.

Abstraction: Hide the details of object instantiation---clients don't need to know the exact class being created.
Control Over Object Creation : Encapsulate object creation logic in dedicated classes or methods, making it easier to manage, extend, or modify without affecting the rest of the system

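Pattern	Core purpose	Number of objects	Example use
Factory	Returns a subclass based on a condition; lets subclasses decide which class to instantiate	Many	Shape creator that returns different shapes
Abstract Factory	Creates a family of related objects	Many	UI factory (button + input box + window)
Builder	Builds a complex object step by step	Many	Building a burger, document, house, etc.
Prototype	Creates new objects by copying an existing one	Many	Cloning map elements, character templates
Singleton	A single global instance	Exactly one	Configuration center, logging system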

The Abstract Factory Pattern

provides an interface to create and return one of several families of related objects.

The Builder Pattern

as the name implies, is an alternative way to construct complex objects. This should be used only when you want to build different immutable objects using the same object-building process.

The Prototype Pattern

starts with an initialized and instantiated class and copies or clones it to make new instances rather than creating new instances.

The Singleton Pattern

is a class of which there can be no more than one instance. It provides a single global point of access to that instance.

The Factory Pattern*

Define an interface for creating an object, but let subclasses decide which class to instantiate. The Factory Method lets a class defer instantiation to its subclasses.

* This pattern is particularly useful when a class cannot anticipate the class of objects it needs to create, or when a class wants its subclasses to specify the objects it creates.
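A short sketch of the Factory Method idea described above. The document-export scenario and all names (Document, Exporter, PdfExporter, HtmlExporter) are assumptions made up for the example.

```java
// Product interface and two concrete products (hypothetical names).
interface Document {
    String render();
}

class PdfDocument implements Document {
    public String render() { return "PDF"; }
}

class HtmlDocument implements Document {
    public String render() { return "HTML"; }
}

// Creator: uses a Document but does not know which concrete class it gets.
abstract class Exporter {
    // The factory method: subclasses decide which class to instantiate.
    protected abstract Document createDocument();

    String export() {
        Document document = createDocument();
        return "Exported " + document.render();
    }
}

class PdfExporter extends Exporter {
    @Override
    protected Document createDocument() { return new PdfDocument(); }
}

class HtmlExporter extends Exporter {
    @Override
    protected Document createDocument() { return new HtmlDocument(); }
}

class FactoryMethodDemo {
    public static void main(String[] args) {
        Exporter exporter = new PdfExporter();
        System.out.println(exporter.export());
    }
}
```

The export logic lives once in Exporter; supporting a new format means adding one product class and one creator subclass.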

Structural patterns

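Concerned with how classes and objects are organized, making the system more flexible and easier to maintain.

Structural patterns explain how to assemble objects and classes into larger structures, while keeping the structures flexible and efficient.

**Core idea:** form larger structures by composing classes or objects.

Typical use cases: optimizing relationships between classes, wrapping interfaces, and supporting extensibility.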

Flexibility: Enable components to be modified, replaced, or extended independently , without affecting the entire system.
Code Reusability: Provide reusable, proven solutions to common structural problems in software design.
Improved Modularity : Help decouple components and define clear interfaces, improving modularity and separation of concerns
Code Maintenance: Encourage well-organized, understandable class and object structures---making code easier to read and update.

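Pattern	Core purpose	Keywords	Common uses
Adapter	Makes two interfaces compatible	Interface conversion	Connecting a legacy system to a new module
Bridge	Decouples abstraction from implementation	Two-dimensional extension, decoupling	Avoiding class explosion, e.g., color + shape combinations
Composite	Handles tree structures of objects uniformly	Uniform interface	File systems, organization structures
Decorator	Adds functionality dynamically	Flexible extension without changing source code	UI components, text formatting
Facade	Simplifies a system's interface	Wraps a complex system	Offering a simple interface to a complex module
Flyweight	Saves memory by reusing shared objects	State separation and sharing	Game map objects, character rendering
Proxy	Controls access and adds functionality	Permissions, caching, remote access	Virtual proxy, protection proxy, remote proxy, etc.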

Adapter Pattern

acts as a connector between two incompatible interfaces that otherwise cannot be connected directly. An adapter wraps an existing class with a new interface so that it becomes compatible with the client's interface.

Bridge Pattern

is to decouple an abstraction from its implementation so that the two can vary independently.

Composite Pattern

is meant to allow treating individual objects and compositions of objects, or "composites" in the same way.

Facade Pattern

encapsulates a complex subsystem behind a simple interface. It hides much of the complexity and makes the subsystem easy to use.

Flyweight Pattern

allows programs to support vast quantities of objects by keeping their memory consumption low. Pattern achieves it by sharing parts of object state between multiple objects.

Proxy Pattern

specifies a design where substitute or placeholder object is put in-place of the actual target object to control access to it. Client accesses the proxy object to work with the target object.

Decorator Pattern*

Dynamically attaches additional responsibilities to an object.

Use the pattern when you need to be able to assign extra behaviors to objects at runtime without breaking the code that uses these objects.

Use the pattern when it's awkward or not possible to extend an object's behavior using inheritance (e.g., a class declared with the final keyword). *You do not want to use inheritance to extend functionality.

1️⃣ Robot interface: defines the behavior of all robots

public interface Robot {
    void Move(int x, int y, int speed);
    void Cook();
}

2️⃣ Basic robot class: ChineseRobot

This is the most basic robot, with no intelligent decision-making.

class ChineseRobot implements Robot {
    public void Move(int x, int y, int speed) { ... }
    public void Cook() {
        System.out.println("Cooking Chinese food");
    }
}

3️⃣ Decorator base class: RobotDecorator

It does nothing by itself; it only forwards calls. It is the "base decorator".

public class RobotDecorator implements Robot {
    protected Robot r;

    public RobotDecorator(Robot r) {
        this.r = r; // wrap the real robot
    }

    public void Move(int x, int y, int speed) {
        r.Move(x, y, speed); // forward the call to the real robot
    }

    public void Cook() {
        r.Cook(); // forward the call to the real robot
    }
}

4️⃣ Enhancing class: RationalRobotDecorator

This class adds the extra behavior (it changes the portion size based on gender).

public class RationalRobotDecorator extends RobotDecorator {
    private boolean gender;

    public RationalRobotDecorator(Robot r, boolean gender) {
        super(r); // tell the decorator which robot is the real one
        this.gender = gender; // true means female
    }

    public void Cook() {
        if (gender) {
            System.out.println("I will cook you a small portion");
        } else {
            System.out.println("I will cook you a big portion");
        }

        super.Cook(); // finally call the original robot's Cook
    }
}

5️⃣ Usage in the main program

At this point, robot r is a ChineseRobot with the small-portion logic added.

Robot r = new ChineseRobot();
r = new RationalRobotDecorator(r, true); // true means female
play.AddRobot(r);

Behavioral patterns

Behavior and communication logic: concerned with how objects interact and how responsibilities are assigned.

Behavioral patterns take care of effective communication and the assignment of responsibilities between objects.

Core idea: decouple the communication between objects so that behavior is more flexible.

Typical use cases: handling complex object communication logic, dividing responsibilities, and changing behavior at runtime.

Responsibility Delegation: Encapsulate behavior inside objects and delegate tasks to them, instead of hardcoding logic.
Flexible Communication: Define how objects are connected and interact, ensuring they can communicate easily while staying loosely coupled.
**Reduced Dependencies:** Decouple implementation from the client to avoid rigid code and make the system easier to extend or change.

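Pattern	Core function	What is decoupled	Typical scenarios
Chain of Responsibility	Passes a request to the next handler	Sender from multiple receivers	Event-handling chains, approval workflows
Command	Encapsulates a request as an object	Requester from executor	UI button actions, task queues, undo
Interpreter	Interprets an expression grammar	Expression from context	DSLs, regex parsers, expression evaluation
Iterator	Accesses collection elements sequentially	Traversal from data structure	Traversing lists, sets, graphs
State	Behavior changes with state	State from behavior	State machines, workflow engines
Strategy	Swaps algorithms dynamically	Algorithm from context	Strategy selection, switching behavior at runtime
Visitor	Separates data structure from operations	Operation from structure	Running uniform operations over a complex object structure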

Chain of Responsibility

A chain of objects is created to deal with the request so that no request goes back unfulfilled.

Command.

The Command pattern deals with a request by wrapping it inside an object as a command and sending it to the invoker object, which then passes it to an appropriate object that can fulfill the request.

Interpreter.

The Interpreter pattern is used for language or expression evaluation by creating an interface that tells the context for interpretation.

Iterator.

The Iterator pattern provides sequential access to the elements inside a collection object without exposing the collection's underlying representation.

Visitor.

A Visitor performs a set of operations on an element class and changes its behavior of execution. Thus, the behavior of the element class varies with the visitor class, and new operations can be added without changing the element class.

State.*

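An object's behavior depends on its internal state rather than being controlled from outside.

The focus is on how an object's state affects its behavior: when the state changes, the behavior changes too. It suits cases with several states where each state behaves differently.

Example: an order object has several states (created, paid, shipped), and it behaves differently in each one.

In the State pattern, the behavior of a class varies with its state and is thus represented by the context object. It allows an object to alter its behavior when its internal state changes. The object will appear to change its class.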

Key Components

Context

Holds a reference to the current state and delegates behavior to it.

It has a method (setter) to switch to a new state.

State Interface

Defines the methods that all concrete state classes must implement.

This represents the behavior that changes depending on the state.

Concrete States

Implement the specific behavior for each state.

If multiple states share similar logic, you can use an abstract class to group common behavior.

State Transitions

Both the Context and the current State can trigger a transition by replacing the current state with a new one.
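The components above can be sketched with the order example (created, paid, shipped). The class and method names are assumptions chosen for illustration; in this version each concrete state triggers the transition itself.

```java
// State interface: the behavior that varies with the state.
interface OrderState {
    String name();
    void next(Order order);
}

// Context: holds the current state and delegates to it.
class Order {
    private OrderState state = new CreatedState();

    void setState(OrderState state) { this.state = state; } // setter used for transitions
    void next() { state.next(this); }
    String status() { return state.name(); }
}

// Concrete states: each one knows which state follows it.
class CreatedState implements OrderState {
    public String name() { return "CREATED"; }
    public void next(Order order) { order.setState(new PaidState()); }
}

class PaidState implements OrderState {
    public String name() { return "PAID"; }
    public void next(Order order) { order.setState(new ShippedState()); }
}

class ShippedState implements OrderState {
    public String name() { return "SHIPPED"; }
    public void next(Order order) { /* terminal state: nothing follows */ }
}

class StateDemo {
    public static void main(String[] args) {
        Order order = new Order();
        System.out.println(order.status());
        order.next();
        System.out.println(order.status());
    }
}
```

The Order class contains no if or switch on the status; adding a new state means adding a class and adjusting one transition.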

Strategy. *

The Strategy pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable. This pattern allows the algorithm to vary independently from the clients that use it.

The focus is on selecting and swapping between algorithms: each algorithm is encapsulated in its own class and can be switched dynamically as needed.

Example: you want to implement a file-compression feature where the user can choose the zip or rar algorithm.

The Strategy pattern deals with changing a class's behavior at runtime. The strategies are objects, and the context object delegates to whichever strategy is selected at runtime.

Key Components

• Context: Maintains a reference to a concrete strategy and communicates with it via the interface.

• Strategy interface: An interface common to the different algorithms.

• Concrete strategy: Provides its own implementation of the interface.

• Client: Creates a specific strategy object and passes it to the context.

Benefits

• Flexibility: It allows the client to choose and change the sorting algorithm at runtime without altering the client code.

• Maintainability : New sorting strategies can be added easily without modifying existing classes, adhering to the Open/Closed Principle.

• Reusability: Different sorting strategies are encapsulated in their own classes, making them reusable across different context.

Application

Use the Strategy Pattern when:

You have multiple ways to do something (e.g., different sorting, payment, or compression algorithms).

You want to choose the algorithm dynamically at runtime instead of hardcoding logic with if or switch.
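A minimal sketch of the four components, using the zip/rar compression scenario. The strategies here only append a file extension as a stand-in for real compression, and all names are hypothetical.

```java
// Strategy interface: common to all algorithms.
interface CompressionStrategy {
    String compress(String fileName);
}

// Concrete strategies (placeholders: they only name the output file).
class ZipStrategy implements CompressionStrategy {
    public String compress(String fileName) { return fileName + ".zip"; }
}

class RarStrategy implements CompressionStrategy {
    public String compress(String fileName) { return fileName + ".rar"; }
}

// Context: keeps a reference to a strategy and talks to it via the interface.
class Compressor {
    private CompressionStrategy strategy;

    Compressor(CompressionStrategy strategy) { this.strategy = strategy; }

    // The algorithm can be swapped at runtime.
    void setStrategy(CompressionStrategy strategy) { this.strategy = strategy; }

    String compress(String fileName) { return strategy.compress(fileName); }
}

// Client: picks a strategy and hands it to the context.
class StrategyDemo {
    public static void main(String[] args) {
        Compressor compressor = new Compressor(new ZipStrategy());
        System.out.println(compressor.compress("report"));
        compressor.setStrategy(new RarStrategy());
        System.out.println(compressor.compress("report"));
    }
}
```

Compared with the State sketch style, the difference is who decides: here the client chooses the strategy, whereas states typically switch themselves.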

Week 5 Software Reuse

Software reuse is the practice of using existing software components (code, designs, etc.) to build new systems.

benefit:

Cost savings, Faster delivery, Higher quality, Specialist knowledge

challenges:

Maintenance costs, Tool support, Not-invented-here syndrome, Finding and Adapting components

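As a housing analogy: COTS is buying a finished house, an Application Framework is prefabricated building materials plus a toolbox, and an SPL is designing a whole property-development model.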

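How to choose?

Start
↓
Is there an **existing commercial product** that meets most of the requirements (>80% of the features, with a small amount of customization acceptable)?
├─ Yes → use ✅ COTS (Commercial Off-The-Shelf)
└─ No
   ↓
   Is the project a **single product** (only one version, with no need for multiple long-term variants)?
   ├─ Yes
   │  ↓
   │  Is there clear delivery-time pressure (e.g., < 6 months) or limited team capacity?
   │  ├─ Yes → use ✅ Application Framework (fast development, clear structure)
   │  └─ No → use Application Framework + custom modules (extensible, easier to maintain later)
   └─ No
      ↓
      Does the project need to support **multiple customer-specific versions** or product variants, with long-term maintenance?
      ├─ Yes → use ✅ SPL (Software Product Line)
      └─ No → consider an Application Framework (build a generic version first; it can evolve into an SPL later)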

Application Framework

Frameworks are like pre-built foundations. You save time by not starting from scratch: you add your customized features to the foundation. You take what others have already written and adapt it or call it.

Pre-built structures (e.g., Django, Spring) that you extend, so you build on a solid foundation.

A generic structure (e.g. a set of classes) that you extend to create specific applications

Examples:

• Web framework: Django (Python), Spring (Java)

• Mobile framework: Flutter (cross-platform apps)

How they work: Provide common features (e.g., security, database support) that you customize.

Analogy: a house's basic infrastructure (plumbing, electricity) --- you design and add your details.

Extension methods

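• Inheritance: Create subclasses from framework classes.

Create a subclass that inherits from a framework class to override or add functionality.

• Callbacks: Register functions to be called at specific times or events.

You write a function and pass it to the framework, telling it to call the function at a specific moment (e.g., when a button is clicked or a response arrives).

• Hooks: Insert custom code at predefined points.

The framework leaves predefined extension points (hook functions) where you write your logic.

With a callback, you tell the framework: "when the event arrives, run this function."

With a hook, you tell the framework: "when the state changes or a lifecycle stage is reached, run this logic." You do not manage when it fires; the framework does that automatically.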
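The three extension methods can be shown side by side in a toy example. MiniFramework is a made-up class standing in for a real framework (it is not Django's or Spring's API); the point is only that the framework owns the control flow and calls your code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical mini-framework: it decides when application code runs.
abstract class MiniFramework {
    private final List<Consumer<String>> callbacks = new ArrayList<>();

    // Callback registration: the application passes in a function for an event.
    void onRequest(Consumer<String> callback) {
        callbacks.add(callback);
    }

    // Hooks: predefined extension points with empty default behavior.
    protected void beforeHandle(String request) { }
    protected void afterHandle(String request) { }

    // Framework-controlled flow: hook, then callbacks, then hook.
    final void handle(String request) {
        beforeHandle(request);
        for (Consumer<String> callback : callbacks) {
            callback.accept(request);
        }
        afterHandle(request);
    }
}

// Inheritance: the application subclasses the framework class and fills in a hook.
class MyApp extends MiniFramework {
    final List<String> log = new ArrayList<>();

    @Override
    protected void beforeHandle(String request) {
        log.add("before:" + request);
    }
}

class FrameworkDemo {
    public static void main(String[] args) {
        MyApp app = new MyApp();
        app.onRequest(request -> app.log.add("callback:" + request));
        app.handle("GET /");
        System.out.println(app.log);
    }
}
```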

Challenges

Learning Curve: Steep initial effort to understand and adapt the framework (e.g., mastering Django's structure), so it is hard to get started.

Overhead: May include unused features (e.g., extra libraries in React).

Application Frameworks vs Libraries

Application Frameworks:

• Structure for entire apps (e.g., Django).

• Controls flow---you extend it.

Software Libraries:

• Tools for specific tasks (e.g., Requests).

• You control flow---call as needed.

Software Product Lines (SPL)

A family of related software products sharing a common core but specialized for different needs.

Example: Salesforce offers CRM variants for sales, support, and marketing, all built from one SPL.

Benefits

• Reuse across multiple products

• Faster development for new variants

• Best for creating similar products with slight variations at the domain level.

Analogy: Like car models sharing the same engine but with different features.

Challenges

Upfront Investment: High upfront cost and time to design a reusable core (e.g., defining shared CRM features).
Complexity: Managing variants can get messy as the family grows (e.g., updating all apps consistently).

Commercial Off-The-Shelf (COTS)

Ready-made software you can buy and use without changing source code (e.g., WordPress for websites).

Example: WordPress for websites, HubSpot for CRM, SAP for ERP

Customization via:

• Configure settings

• Plugins

• API calls

Benefits :

• Fast deployment

• Lower development risk

• Vendor support

COTS vs SPL

In terms of reusability, COTS is generic software readily reusable by a broad range of users, while SPL requires a certain degree of customization and composition using a range of reusable components.

Feature	COTS	SPL
Product Type	Single product for broad use	Family of products for specific domains
Customization	Limited	Highly customizable
Usage	Ready to use	Requires upfront development

Salesforce Case Study

Internally, Salesforce uses SPL to develop multiple products .

Externally, products are marketed as individual COTS products.

SPL products can be packaged and sold as COTS, but not always.

Testing

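COTS testing mainly ensures that the software works smoothly with your business systems and avoids the risks that integration brings.

• Key focus: integration testing to ensure COTS works with your business processes.

• COTS is pre-tested, but integration with your operations may not be smooth.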

Example:

• Ensure the COTS CRM integrates well with your existing email system

• Ensure the COTS CRM works coherently with your process to manage customers

Challenges

Limited Customization: Configuration-only approach restricts flexibility (e.g., WordPress cannot be fully rewritten).

Vendor Dependency: Reliant on vendor updates or support (e.g., delays in fixing SAP bugs).

Week 6 Critical Analysis

A systematic evaluation process to assess the strengths, weaknesses, and validity of software artifacts.

Purpose :

Identify defects early
Improve system reliability
Optimize development processes

Main methods:

Failure Modes and Effects Analysis (FMEA): Identifies potential failures and their impact.
Root Cause Analysis (RCA): Traces problems back to their origin.
SWOT Analysis: Evaluates strengths, weaknesses, opportunities, threats.
Code Review: Systematic examination of source code.
Requirements Validation: Ensures requirements are clear, complete, and testable.

Failure Modes and Effects Analysis (FMEA)

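Identifies potential failures and their impact. (Failure Modes and Effects Analysis is a **systematic, proactive risk-management tool** used to identify the ways something could fail and to analyze what effect those failures would have on the system or product.)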

Proactive : Prevent problems before they occur.
Risk-Based : Identify and prioritize failure risks.
Structured Planning: Helps teams focus on the most critical failure points.

e.g., You're building an online store. You predict that the payment system might fail (e.g., "Credit card doesn't process"). You rate how bad that would be and plan to add extra security.

How it works

List system components/functions
Identify possible failure modes (e.g., "payment not processed")
Rate each failure on the three factors of the Risk Priority Number (RPN):
- Severity (S) -- How bad is the impact?
- Occurrence (O) -- How likely is it to happen?
- Detection (D) -- How easily can it be detected?

Calculate RPN = S × O × D , then sort and prioritize.

Risk Priority Number（RPN）

Severity (S): How bad is the impact of the failure on the system, user, or business? (1 = minor, 10 = catastrophic)
Occurrence (O): How often does the failure happen? (1 = rare, 10 = frequent)
Detection (D): How likely are we to catch the failure before it reaches the user? (1 = always detected, 10 = undetectable)

RPN = S × O × D , then sort and prioritize .
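The scoring and prioritization step can be sketched as below. The failure modes and their scores are invented purely for illustration, and the class names FailureMode and Fmea are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// One row of a hypothetical FMEA worksheet; each score uses the 1-10 scale.
class FailureMode {
    final String name;
    final int severity;
    final int occurrence;
    final int detection;

    FailureMode(String name, int severity, int occurrence, int detection) {
        for (int score : new int[] {severity, occurrence, detection}) {
            if (score < 1 || score > 10) {
                throw new IllegalArgumentException("scores must be between 1 and 10");
            }
        }
        this.name = name;
        this.severity = severity;
        this.occurrence = occurrence;
        this.detection = detection;
    }

    // RPN = S x O x D
    int rpn() {
        return severity * occurrence * detection;
    }
}

class Fmea {
    // Returns a new list sorted by RPN, highest risk first.
    static List<FailureMode> prioritize(List<FailureMode> modes) {
        List<FailureMode> sorted = new ArrayList<>(modes);
        sorted.sort(Comparator.comparingInt(FailureMode::rpn).reversed());
        return sorted;
    }
}

class FmeaDemo {
    public static void main(String[] args) {
        List<FailureMode> modes = List.of(
                new FailureMode("slow page", 3, 6, 2),
                new FailureMode("payment not processed", 8, 4, 5),
                new FailureMode("data loss", 10, 2, 7));
        for (FailureMode mode : Fmea.prioritize(modes)) {
            System.out.println(mode.name + " -> RPN " + mode.rpn());
        }
    }
}
```

With these made-up scores, "payment not processed" (8 × 4 × 5 = 160) ranks above "data loss" (10 × 2 × 7 = 140), even though data loss has the higher severity, because it is judged less likely to occur.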

Gathering Data

Collect data from a mix of historical records, team expertise, testing, user feedback, and industry

Severity (S): User Impact Analysis, Stakeholder Input, Industry Standards, Historical Data
Occurrence (O): Log Analysis, Testing Data, Developer Estimates, Usage Patterns, Benchmarks
Detection (D): Test Coverage, Monitoring Systems, User Reports, Design Reviews

Advanced Focus

Quantitative FMEA

Use probabilistic models (e.g., failure rates, MTTF).

Replace subjective 1-10 scoring with data-driven metrics:

Severity: Based on impact (e.g., % downtime, revenue loss)
Occurrence: Derived from failure rates or MTTF
Detection: Based on test coverage (e.g., 80% → D = 2)
Formula : RPN = S × O × (1 - D%)

Mitigation Prioritization

Optimize resource allocation using cost-benefit analysis.

Cost-Benefit Model:
• Cost = Development effort (hours) + Tooling cost.
• Benefit = Risk reduction (ΔRPN × Impact).

Example:

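• Fix "DB crash" (RPN 200, $10K loss) with 50-hour effort ($5K).

• ΔRPN = 150, Benefit = 150 × $10K = $1.5M, ROI = High.

The RPN is 200 before the fix and drops to 50 afterwards, so ΔRPN = 150. Each failure costs $10K, so the risk reduction is valued at 150 × $10K = $1.5M. The fix takes 50 hours, a development cost of $5K. The benefit far exceeds the cost, so the ROI is high and this fix deserves priority.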

Root Cause Analysis (RCA)

Traces problems back to their origin to find the root cause of a failure.

RCA is reactive: it's a detective tool to fix trouble after it happens.

Basic Method: 5 Whys. Keep asking "why", layer by layer, until the deepest cause is found. (Not examined, so not covered further here.)

Fault Tree Analysis

A top-down, logical method to identify and analyze all possible causes of a specific system failure (the "top event").

The top event (e.g., "website crashes") is at the root.
Branches show possible causes and conditions leading to it.
Visual, systematic, and uses probabilities.

Purpose:

Identify root causes (e.g., "server runs out of memory").
Assess risk by estimating failure probabilities.
Help prevent failures by focusing on critical causes.

Elements of the FTA diagram

Events

• Top event

• Intermediate event

• Basic event

• Undeveloped event

Logic Gates

• AND: P(A ∩ B) = P(A) × P(B) (assuming independent events)

• OR: P(A ∪ B) = 1 - (1 - P(A)) × (1 - P(B))

Failure Probabilities

Basic event

p(Server offline) = 0.01/hour

p(Auth failed) = 0.02/hour

p(Too many transactions) = 0.05/hour

p(No lock timeout) = 0.1

p(No validation) = 0.03

Intermediate event

p(Payment Gateway failed) = p(Server offline) OR p(Auth failed)

= 1 - (1 - 0.01) * (1 - 0.02)

= 1 - (0.99 * 0.98)

= 0.0298/hour
(Why 0.0298 rather than 0.01 + 0.02 = 0.03? Adding directly would double-count the case where both fail at the same time.)

p(DB locks) = p(Too many transactions) AND p(No lock timeout)

= 0.05 * 0.1

= 0.005/hour

p(Input crash) = p(No input validation) OR (undeveloped event)

= 0.03

Top event

p(Checkout fails) = 1 - (1 - 0.0298) * (1 - 0.005) * (1 - 0.03)

= 1 - (0.9702 * 0.995 * 0.97)

= 1 - 0.9364 = 0.0636/hour (We explore further in the later section)
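The gate arithmetic above can be reproduced with two small helper functions. This is a sketch that assumes independent events; the class name FaultTree is made up, and the basic-event values are the ones listed under "Failure Probabilities" (including p(No lock timeout) = 0.1).

```java
class FaultTree {
    // AND gate: every independent input event must occur.
    static double and(double... probabilities) {
        double result = 1.0;
        for (double p : probabilities) {
            result *= p;
        }
        return result;
    }

    // OR gate: at least one independent input event occurs.
    static double or(double... probabilities) {
        double noneOccur = 1.0;
        for (double p : probabilities) {
            noneOccur *= (1.0 - p);
        }
        return 1.0 - noneOccur;
    }

    // Checkout fault tree built from the basic events in these notes.
    static double checkoutFails() {
        double paymentGatewayFailed = or(0.01, 0.02); // server offline OR auth failed
        double dbLocks = and(0.05, 0.1);              // too many transactions AND no lock timeout
        double inputCrash = 0.03;                     // no input validation
        return or(paymentGatewayFailed, dbLocks, inputCrash);
    }
}

class FaultTreeDemo {
    public static void main(String[] args) {
        System.out.printf("p(Checkout fails) = %.4f per hour%n", FaultTree.checkoutFails());
    }
}
```

Keeping the gates as functions makes it easy to re-run the tree when a basic-event estimate changes.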

Dynamic Nature of Probability 动态概率

These are per-hour probabilities of failure, also called **failure rates**.

The usual format is 0.01/hour (a 1% chance of failure in each hour).

• Probabilities specified per hour (e.g., 0.01/hour) are failure rates: they describe how often an event happens over time.

These are typically used for:

• Dynamic Events: Things that occur randomly or intermittently, like hardware failures, network drops, or load spikes.

• Time-Based Systems: When we're analyzing a system's behavior over a period (e.g., an hour of checkout activity).

Mean Time To Failure (MTTF)*

In FTA, failure rates are tied to Mean Time To Failure (MTTF)
MTTF = 1 / failure rate

Example: "Gateway server offline" = 0.01/hour → MTTF = 1 / 0.01 = 100 hours. It fails once every 100 hours on average.

In software, "per hour" often comes from:

• Logs: Historical data (e.g., "Server crashed 5 times in 500 hours" → 0.01/hour).

• Metrics: Monitoring (e.g., "Traffic spikes 50 times in 1,000 hours" → 0.05/hour).

• Vendor Specs: Hardware or service uptime stats (e.g., 99.9% uptime → 0.001/hour downtime).
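The log-based estimate and the MTTF relation can be written as two one-line functions. The class and method names are hypothetical, and the sketch assumes a constant failure rate, which is what MTTF = 1 / failure rate relies on.

```java
class Reliability {
    // Failure rate estimated from logs: observed failures divided by observed hours.
    static double failureRate(int failures, double observedHours) {
        if (failures < 0 || observedHours <= 0) {
            throw new IllegalArgumentException("need failures >= 0 and observedHours > 0");
        }
        return failures / observedHours;
    }

    // MTTF = 1 / failure rate, in hours.
    static double mttfHours(double failureRatePerHour) {
        if (failureRatePerHour <= 0) {
            throw new IllegalArgumentException("failure rate must be positive");
        }
        return 1.0 / failureRatePerHour;
    }
}

class ReliabilityDemo {
    public static void main(String[] args) {
        // "Server crashed 5 times in 500 hours"
        double rate = Reliability.failureRate(5, 500);
        System.out.println("rate = " + rate + "/hour, MTTF = " + Reliability.mttfHours(rate) + " hours");
    }
}
```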

Static Nature of Probability 静态概率

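These probabilities relate to the state of the design. They have no time unit and express whether a defect exists.

The usual format is 0.1 (a 10% chance that the defect is present).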

Probabilities without a time unit are static probabilities---they're not rates but single, dimensionless chances.

These apply to:
Constant Conditions: Design flaws, configuration errors, or missing features that don't "happen" over time---they just exist.
State-Based Events: The likelihood something is true at any given moment, not how often it occurs.

Mixed Nature of Probability*

How can static and dynamic probabilities be combined?

e.g., Video Streaming App Freezes

Network disconnect: A time-based failure rate of 0.02/hour (2% chance per hour the network drops, based on ISP stats).

Outdated codec: A non-time-based probability of 0.1 (10% chance the app uses an outdated codec, from version audits).

Approach 1: Convert Static to Time-Based (force the static probability into a rate)

• If 10% of instances have an outdated codec, suppose it triggers a freeze 10% of the time under load.

• Assumption: Let's set it as 0.1/hour, i.e. a 10% chance per hour that the codec fails to decode, freezing playback.

p(Video playback freeze) = p(Network disconnected OR Outdated codec)

= 1 - [(1 - 0.02) * (1 - 0.1)]

= 1 - (0.98 * 0.9)

= 1 - 0.882

= 0.118/hour (MTTF = 1/0.118 ≈ 8.47 hours)

Approach 2: Treat Static as Conditional (model separately and derive with conditional probability)

A more precise approach is to treat the static defect as a condition:

In software, static conditions like this often need a trigger to cause failure.

• Static Probability: 0.1 is the chance the system has an outdated codec.

• Trigger Needed: It only freezes playback when paired with an event (e.g., playing an incompatible stream).

"Outdated codec" as a condition suggests it's not an independent hourly failure.

Approach 2:

• Network Disconnect: 0.02/hour

• Codec Failure:

• Incompatible Stream: 0.05/hour (hypothetical rate).

• Outdated Codec: 0.1 (10% chance the app has an old codec). Note: the figure mislabels this as 0.2; it should be 0.1.

• p(Codec Failure) = 0.05 * 0.1 = 0.005/hour

p(Video playback freeze) = 1 - (1 - 0.005) * (1 - 0.02) = 0.0249/hour

MTTF = 1 / 0.0249 ≈ 40 hours
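Both approaches can be checked numerically. This is a minimal sketch that assumes the events are independent; `or_gate` and `and_gate` are illustrative helper names, not part of any FTA library.

```python
def or_gate(*probs: float) -> float:
    """P(at least one of several independent events occurs)."""
    none_occur = 1.0
    for p in probs:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def and_gate(*probs: float) -> float:
    """P(all of several independent events occur)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


network = 0.02  # per hour

# Approach 1: the static 0.1 is simply treated as a 0.1/hour rate.
approach1 = or_gate(network, 0.1)

# Approach 2: the outdated codec (0.1) only matters when an
# incompatible stream (0.05/hour, hypothetical) triggers it.
codec_failure = and_gate(0.05, 0.1)
approach2 = or_gate(network, codec_failure)

print(f"Approach 1: {approach1:.4f}/hour, MTTF = {1 / approach1:.2f} hours")
print(f"Approach 2: {approach2:.4f}/hour, MTTF = {1 / approach2:.1f} hours")
```

The conditional model gives a much lower freeze rate because the static defect no longer counts as an hourly failure by itself.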

week8-week9

Not on the exam, so I'm skipping these weeks.

Week10 Open Source Development

Open-source software (OSS) is software whose source code is freely available under a license that allows anyone to study, modify, and distribute it.

This contrasts with proprietary software, where the source code is restricted and usually hidden.

Cathedral Model vs Bazaar Model

Cathedral Model

Characteristics:
- Closed & centralized development by a small group.
- Limited access to source code until official release.
- Long release cycles with thorough internal testing.

Advantages:
- High control over design and quality.
- Better security and consistency with fewer contributors.

Limitation:
- Lack of user feedback, making it harder to adapt to changing needs.

Bazaar Model

Characteristics:
- Open & decentralized development by the community.
- Large contributor base from diverse backgrounds.
- Rapid iteration with frequent updates and bug fixes.

Advantages:
- Diverse input leads to innovation and flexibility.
- Faster bug fixing: "Given enough eyeballs, all bugs are shallow."

Limitation:
- Harder quality control due to many contributors and fast changes.

Despite these challenges, the Bazaar model powers major open-source successes like Linux, Python, and Apache, highlighting the strength of community collaboration.

Key Principles of Open Source Development***

The key principles represent the philosophy and values of the open source movement.

They shape how open source software is developed and create a unique culture within the open source community.

Meritocracy 任人唯贤 and community-driven development

Meritocracy refers to the idea that individuals gain influence based on their contributions to the project. The more valuable contributions a person makes, the more they are recognized and respected within the community.

User feedback and iterative improvement

User feedback plays a crucial role in open source projects. It helps developers understand how the software is being used, what issues users are facing, and what features they would like to see in future updates.

Transparency and collaboration

Transparency in open source development means that the entire development process is open to public scrutiny. This includes the code base, issue tracking, discussions, and decision-making processes. It fosters a collaborative environment where developers can learn from each other, share knowledge, and improve upon each other's work. It also helps build trust within the development community and creates an inclusive and engaging community.

Licensing and legal considerations

Licenses define the legal parameters of how open-source software can be used, modified, and distributed. They protect the rights of both the developers and the users.

Case Studies of Open Source Projects

Successful examples

Linux Kernel: Started by Linus Torvalds in 1991, it grew from a solo project to a global collaboration, powering systems like Android and Ubuntu.
Python: Known for simplicity and power, Python has thrived thanks to its strong community and the Python Software Foundation.
Apache HTTP Server: Maintained by the Apache Software Foundation, it's one of the most widely used web servers, built through open, collaborative development.

Challenges

Coordinating contributions from a distributed community.

Ensuring the quality and security of the code.

Sustaining momentum over time (projects can fade due to burnout, stagnation, or competition).

Solution

Community is vital: Active contributors keep projects alive and evolving.

Clear governance matters: Defined roles and processes help guide contributions effectively.

The social and cultural aspects of open source development involve the community that forms around the project. This includes social norms, communication patterns, decision-making processes, conflict resolution mechanisms, and recognition systems.

Community Building

A strong open-source project relies on its community of users, developers, and contributors. Fostering respect, inclusivity, and collaboration helps the project grow and evolve.

Communication and Collaboration

Open-source projects thrive on clear, online communication through forums, chat, and issue trackers, enabling global collaboration across cultures and time zones.

Conflict Resolution

It's important to have established processes for conflict resolution that are fair and transparent and that promote productive dialogue, e.g. public discussions, votes, or decisions by a designated authority such as a project leader or a steering committee.

Recognition and Reputation

Recognition and reputation are major motivators in open source communities. Contributors earn recognition for their contributions, and this can enhance their reputation within the community, potentially leading to career opportunities.

Learning and Mentorship

Open source projects provide a platform for continuous learning and mentorship.

Newcomers to the project learn from more experienced contributors, not just about code, but also about software development practices, teamwork, and problem-solving.

Diversity and Inclusion

Embracing diversity and inclusion brings richer ideas and solutions, making open-source communities more innovative and welcoming.

Week 11-13 Testing

Software Testing

Software testing verifies whether software meets its requirements and works as expected.

Testing can find defects but cannot prove the software is completely correct.

Why Testing?

To find more problems (Meyer: testing is finding errors)

Testing never fully ends

To ensure product quality

To check if product meets requirements

Test Stopping Criteria

• Meet deadline, exhaust budget, ... (management decision)

• Achieved desired coverage

• Achieved desired level of failure rate

Testing Activities

Identify - design - build - execute - compare

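Property / Type	Black-box testing	White-box testing	Integration testing
Knowledge of code	No	Yes	Either (depends on the strategy)
Test focus	Whether functionality meets expectations	Whether code logic is correct	Whether modules cooperate correctly
Stage applied	Functional testing, system testing	Unit testing	After unit testing, before system testing
Common techniques	Equivalence partitioning, boundary value analysis, etc.	Path coverage, branch coverage, statement coverage, etc.	Interface call verification, correctness of data passing, etc.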

Week11 Black box / Functional Testing

Black-box testing is a software testing method where the system is evaluated solely on its external behavior, without access to internal code or structure.

The tester focuses on inputs and expected outputs to validate whether the software behaves according to its specifications.

input → blackbox → output

If the output matches expectations for a given input, the test passes.

Advantages
Implementation-Independent: Test cases remain valid even if the code changes.
Parallel Development: Tests can be designed alongside implementation, speeding up the development process.

Disadvantages
Redundant Test Cases: Overlap may occur among test inputs.
Gaps in Coverage: Some important input conditions may remain untested.

Goal

Since it's impractical to test all possible inputs, black-box testing focuses on reducing test cases using:

• Divide input conditions into equivalence classes

• Choose test cases for each equivalence class.

(Example: If an object is supposed to accept a negative number, testing one negative number is enough)

Boundary Value Testing (BVT)

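Core idea: errors most often occur near the boundaries of the input.
Test focus: the boundary values themselves and the values next to them.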

A program is viewed as a function:

Inputs → form the domain

Outputs → form the range

Boundary Value Analysis (BVA) is a key functional testing technique.

Functional testing traditionally focuses on the input domain, but output-based cases can also provide value.

Key idea:

Targets the edges (boundaries) of input ranges.
Rationale: Bugs often occur at or near boundary values (e.g., min/max limits).
Particularly effective in weakly typed languages where implicit type conversions may occur.

Characteristics

May generate more test cases than domain testing or equivalence class testing.

Test coverage may be lower, but automation is easier due to the method's simplicity.

Valid Input for Program P

The boundary inequalities of n input variables define an n-dimensional input space (domain).

For a program f(y₁, y₂) where a ≤ y₁ ≤ b and c ≤ y₂ ≤ d,

the input space is a 2D region bounded by these inequalities.

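Type	Test values per variable	Combined?	Total test cases (n variables)
Normal BVT	min, min+1, max-1, max (4 values)	No (one variable varies at a time)	4 × n + 1 (the +1 is the all-nominal case)
Worst-case BVT	min, min+1, nominal, max-1, max (5 values)	Yes (all combinations)	5^n
Robust BVT	min-1, min, min+1, max-1, max, max+1 (6 values)	No (one variable varies at a time)	6 × n + 1 (the +1 is the all-nominal case)
Robust Worst-case BVT	min-1, min, min+1, nominal, max-1, max, max+1 (7 values)	Yes (all combinations)	7^n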

Normal BVT

The basic idea in boundary value analysis is to select input variable values at their:

Minimum (min)
Just above minimum (min+)
Nominal (nom = (min + max) / 2)
Just below maximum (max-)
Maximum (max)

Single fault assumption: failures are usually caused by a fault in one variable, not by multiple faults at the same time.

Test case design:
- Hold all variables at nominal except one, which takes the 5 boundary values.
- Repeat for each variable.
- For n variables, total test cases = 4n + 1.

Generalizing

Boundary value analysis identifies important input/output values for testing.

Can be generalized by:

Number of variables: n variables → 4n + 1 tests

Variable ranges:

Bounded discrete: use 5 key values (min, min+, nom, max-, max)

Unbounded discrete: set artificial bounds

Logical variables (e.g. booleans): only true/false, limited boundary value application

Limitations

Works best for functions with multiple independent variables representing bounded physical quantities (e.g., temperature, pressure).
Does not consider program logic or variable semantics; it looks only at boundary values.
Effective for physical variables, less so for logical or symbolic variables (e.g., PIN codes, phone numbers).

Worst-case BVT

Core idea: Rejects the single fault assumption and considers multiple variables simultaneously at extreme values.

Test design:

Uses the 5 boundary values per variable from normal BVT.
Takes the Cartesian product for 2, 3, ... n variables.
For n variables, total test cases = 5^n.

Best applied when:

Physical variables have many interactions,
And program failure is costly.

Robust BVT

Robustness testing is a simple extension of boundary value analysis.

In addition to the five boundary values, add:

a value slightly greater than the maximum (max+)

a value slightly less than the minimum (min-).

For a function of n variables, there will be 6n + 1 unique test cases.

Main value: it forces attention on exception handling.

Language considerations:

• In some strongly typed languages, values beyond the predefined range will cause a run-time error.

• The choice is between a weakly typed language with exception handling and a strongly typed language with explicit logic to handle out-of-range values.

Robust worst-case BVT

Extends worst-case testing by including slightly out-of-bound values (min- and max+) in addition to the five boundary values.

• Uses the Cartesian product of 7 values per variable (from robustness testing).

• For n variables, total test cases = 7^n
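The four BVT variants can be generated mechanically. The sketch below is illustrative only: it assumes integer variables with a step of 1, ranges at least 4 wide, and nominal = (min + max) // 2; the function names are invented for this example.

```python
from itertools import product


def boundary_values(lo: int, hi: int, robust: bool = False) -> list[int]:
    """min, min+, nom, max-, max (plus min- and max+ when robust)."""
    nominal = (lo + hi) // 2
    values = [lo, lo + 1, nominal, hi - 1, hi]
    if robust:
        values = [lo - 1] + values + [hi + 1]
    return values


def single_fault_cases(ranges, robust=False):
    """Normal / robust BVT: vary one variable, hold the others at nominal."""
    nominals = [(lo + hi) // 2 for lo, hi in ranges]
    cases = [tuple(nominals)]                # the shared all-nominal case
    for i, (lo, hi) in enumerate(ranges):
        for value in boundary_values(lo, hi, robust):
            if value == nominals[i]:
                continue                     # already covered by all-nominal
            case = list(nominals)
            case[i] = value
            cases.append(tuple(case))
    return cases


def worst_case_cases(ranges, robust=False):
    """Worst-case / robust worst-case BVT: Cartesian product of all values."""
    return list(product(*(boundary_values(lo, hi, robust) for lo, hi in ranges)))


sides = [(1, 200)] * 3                       # three variables, each in [1, 200]
print(len(single_fault_cases(sides)))                # normal BVT
print(len(worst_case_cases(sides)))                  # worst-case BVT
print(len(single_fault_cases(sides, robust=True)))   # robust BVT
print(len(worst_case_cases(sides, robust=True)))     # robust worst-case BVT
```

For three variables the counts follow 4n + 1, 5^n, 6n + 1 and 7^n respectively.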

e.g triangle problem

• Problem Statement

• Input: 3 integers (sides of a triangle)

• Output: Type of Triangle (Equilateral, Isosceles, Scalene or NotATriangle)

• Extended Version: Additional Output Type: Right Triangle

• In the problem statement, no conditions are specified on the triangle sides, other than being integers.

• Obviously, the lower bounds of the ranges are all 1.

• We arbitrarily take 200 as an upper bound.

• For each side, the test values are {1, 2, 100, 199, 200}.

• Robust boundary value test cases will add {0, 201}.

• The table on the next page contains Normal Boundary value test cases using these ranges.

• Test cases 3, 8, and 13 are identical -- Redundant

• There is no test case for scalene triangles

e.g The NextDate Function

• The function takes 3 variables: the month, the day, and the year. It returns the date of the next day.

• We could encode these, so that January would correspond to 1, February to 2, and so on.

• In this example, we use worst-case boundary value testing. All 125 worst-case test cases for NextDate are listed in the table on the next page.

• Examine the following:

• Gaps of untested functionality

• Redundant testing

• Questions:

• Would anyone actually want to test January 1 in five different years?

• Is the end of February tested sufficiently?

...

Equivalence Class Testing

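Core idea: divide inputs into valid and invalid equivalence classes, and test only one representative value from each class.
Test focus: reduce duplicate tests while covering every type of input.

Goal: Divide input data into equivalence classes where each class represents a set of inputs that are treated similarly by the system. Test just one representative value from each class.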

Benefits:

Reduces redundant test cases

Ensures representative coverage of input types

Definition and Key Properties

An equivalence class is a subset of input values such that:

All values in the same class are expected to produce similar behavior.

Classes are mutually disjoint and their union covers the entire input domain.

Two important implications for testing:
Completeness: Every possible input is covered by at least one class.
Non-redundancy: No input belongs to more than one class.

vs BVT:

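BVT assumes the input variables are independent of each other (which may not hold).

BVT can produce more test cases yet achieve worse coverage.

By contrast, equivalence class testing puts more weight on logical partitioning and choosing representative values.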

Core Ideas

Divide input/output domains into groups (equivalence classes); test one representative from each.

Typically applied to the input domain, but ideally used for both input and output.

Include robust classes (invalid inputs) for stronger coverage.

Combine with Boundary Value Analysis and Worst-Case Testing (multiple faults).

Scenarios
Large Input Domains: Reduces test cases by grouping similar inputs.
Boundary Conditions: Helps find errors near class limits.
Avoid Redundancy: One test per class avoids repeated tests.
Well-Defined Inputs: Best for systems with clear input rules (e.g., age, dropdowns).
Early Bug Detection: Speeds up testing and catches major issues early.

Reminder: Equivalence class testing finds many issues, but not all. Use it alongside other techniques for full coverage.

Output

Equivalence class partitioning can be applied also to the output domain of the software under test.

Represent the different responses or behaviors that the software may exhibit based on its input.

Verify whether the software behaves as expected for each class of outputs, which can be more efficient than testing every individual output.

Approach

Choose one test case from each equivalence class.
The crucial part is defining the equivalence relation that forms these classes.
Requires good understanding of the input domain, often beyond just interface specs.
Based on specifications, not code knowledge.
Must consider dependencies between inputs.

Testing for 2-variable function

Consider a function f(x1,x2) where the values of x1 and x2 are defined to be

a <= x1 <= b and c <= x2 <= d

Assume the following equivalence classes for x1

• {a, a+1, ..., ta}, {ta+1, ta+2, ..., tb}, {tb+1, tb+2, ..., b}

Assume the following equivalence classes for x2

• {c, c+1, ..., tc}, {tc+1, tc+2, ..., td}, {td+1, td+2, ..., d}

Testing only requires picking one value from each class.

Test Data selection for 2-variable function

Choose one representative value from each equivalence class

Total test cases = n × m

n = number of x1 partitions

m = number of x2 partitions

Robustness Testing

Introduce invalid input classes (values outside the input range) to test the program's fault tolerance.

Extended equivalence classes for x1:

{values < a}, {a, a+1, ..., ta}, {ta+1, ta+2, ..., tb}, {tb+1, tb+2, ..., b}, {values > b}

Extended equivalence classes for x2

{values < c}, {c, c+1, ..., tc}, {tc+1, tc+2, ..., td}, {td+1, td+2, ..., d}, {values > d}

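Strategy	Tests invalid inputs?	Multiple invalid values combined?	Formula
Weak Robust	Yes (one at a time)	No (at most one per test case)	1 + Σ (number of invalid classes per variable)
Strong Robust	Yes	Yes (any combination)	(valid + invalid)^n, the Cartesian product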

Weak Normal

Goal: Cover each valid equivalence class at least once.
Strategy: Do not test all combinations; choose the minimum number of test cases.
Test Case Count: Equal to the largest number of partitions among all input variables.

• Variable x1

• V1 = {a, a+1, ..., ta}

• V2 = {ta+1, ta+2, ..., tb}

• V3 = {tb+1, tb+2, ..., b}

• Variable x2

• V4 = {c, c+1, ..., tc}

• V5 = {tc+1, tc+2, ..., td}

• V6 = {td+1, td+2, ..., d}

e.g

• Consider a basic system that accepts two integer inputs, X and Y, and performs a division operation X/Y.

• Here are some equivalence classes for both X and Y:

• Variable X: {values <= -1}, 0, {values >= 1}

• Variable Y (denominator): {values <= -1}, 0 (invalid), {values >= 1}

• Weak Normal Equivalence Class test cases

• (X, Y) = (-3, -5), (0, 8), (12, 2)

* Y = -2 is legal from the equivalence-class point of view, but if the specification requires a positive denominator, such values cannot be counted as normal input.

Strong Normal

Goal: Test all combinations of valid equivalence classes.
Strategy: Use the Cartesian product of all valid subsets.
Test Case Count: n × m where n and m are the number of partitions in each input.

Equivalence Classes:

• Variable x1

• V1 = {a, a+1, ..., ta}

• V2 = {ta+1, ta+2, ..., tb}

• V3 = {tb+1, tb+2, ..., b}

• Variable x2

• V4 = {c, c+1, ..., tc}

• V5 = {tc+1, tc+2, ..., td}

• V6 = {td+1, td+2, ..., d}

• Test cases

• <V1, V4>, <V1, V5>, <V1, V6>

• <V2, V4>, <V2, V5>, <V2, V6>

• <V3, V4>, <V3, V5>, <V3, V6>

Weak Robust

Goal: Cover each equivalence class (including invalid/out-of-range classes), but not all combinations.
Strategy: Add 1 test case per out-of-range class.
Test Case Count: Still minimal, just enough to cover every class.

Extended Equivalence Classes:

Extended equivalence classes for x1

{values < a}, {a, a+1, ..., ta}, {ta+1, ta+2, ..., tb}, {tb+1, tb+2, ..., b}, {values > b}

Extended equivalence classes for x2

{values < c}, {c, c+1, ..., tc}, {tc+1, tc+2, ..., td}, {td+1, td+2, ..., d}, {values > d}

Strong Robust

Goal: Fully test all combinations of both valid and invalid equivalence classes.
Strategy: Take the Cartesian product of all valid + invalid classes.
Test Case Count: Very high, exponential in the number of classes.
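The four strategies can be compared on the X / Y division example. This is a sketch under assumptions: one hand-picked representative per class, Y = 0 as the only invalid class, and weak robust taken as the weak normal cases plus one extra case per invalid class. All names are illustrative.

```python
from itertools import product

# One representative per equivalence class (values chosen arbitrarily).
VALID = {"X": [-3, 0, 12], "Y": [-5, 2]}
INVALID = {"X": [], "Y": [0]}


def weak_normal(valid):
    """Cover every valid class at least once with as few cases as possible."""
    names = list(valid)
    rounds = max(len(valid[n]) for n in names)
    return [tuple(valid[n][i % len(valid[n])] for n in names)
            for i in range(rounds)]


def strong_normal(valid):
    """Cartesian product of all valid classes."""
    return list(product(*valid.values()))


def weak_robust(valid, invalid):
    """Weak normal cases plus one case per invalid class (one invalid value at a time)."""
    names = list(valid)
    cases = weak_normal(valid)
    for i, name in enumerate(names):
        for bad in invalid[name]:
            case = [valid[n][0] for n in names]   # others stay valid
            case[i] = bad
            cases.append(tuple(case))
    return cases


def strong_robust(valid, invalid):
    """Cartesian product of valid and invalid classes together."""
    return list(product(*(valid[n] + invalid[n] for n in valid)))


for strategy, cases in [
    ("weak normal", weak_normal(VALID)),
    ("strong normal", strong_normal(VALID)),
    ("weak robust", weak_robust(VALID, INVALID)),
    ("strong robust", strong_robust(VALID, INVALID)),
]:
    print(strategy, len(cases), cases)
```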

Decision Table Based Testing

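Core idea: design test cases from combinations of rules. Enumerate all combinations of conditions, then simplify the rules by merging similar cases or using "don't care" values, so that testing is both thorough and efficient.
Test focus: decision logic under combinations of multiple conditions.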

Design test cases based on combinations of rules.

Focus on logical relationships between inputs (conditions) and outputs (actions).

Key Points

Precisely and compactly models complex logic.

Like if-then-else or switch-case but can elegantly handle multiple independent conditions and actions.

Ensures all condition combinations are covered.

Useful for specifying complex program logic.

Usage

Specs are given or can be converted to decision tables.

Predicate evaluation order does not affect outcome.

Only one rule applies per situation.

Action execution order within a rule is irrelevant.

Structure

Conditions: Variables or predicates with possible values.
Condition Entries:
- Boolean (True/False): Limited Entry Tables.
- Multiple values: Extended Entry Tables.
- "Don't care" values simplify rules.
Actions: Procedures to perform.
Action Entries: Indicate if the action is executed under the rule.

Before using the tables, ensure:
Completeness: All condition combinations explicitly listed, including defaults.
Consistency: Each condition combination maps to a unique action or set of actions.

Development Methodology

Determine conditions and values
Determine maximum number of rules
Determine all possible actions
Encode possible rules
Encode the appropriate actions for each rule
Verify the policy
Simplify the rules (reduce the number of columns whenever possible)

Observation

Best for programs with prominent if-then-else logic and complex input-output relationships.
Suitable when input subsets affect computations or cause-effect relations exist.
Does not scale well to very large sets of conditions.
Can be refined iteratively.

Week12 White box testing

Also known as: Glass Box, Structural, Clear Box, Open Box Testing
Core idea: A software testing technique whereby explicit knowledge of the internal workings of the item being tested is used to select the test data.

Code-aware: Tester knows the source code, logic, and data structures.
Compared to Black Box: Focuses on how it works, not just the output.
Uses Math Models: Graphs, trees, and matrices help analyze control flow.
Collects Metrics: Statement coverage, path coverage, and (Cyclomatic) complexity.
Common Uses: Unit testing, Security testing, Code optimization & quality checks

Program/ Control Flow Graph

A directed graph representing the control flow of a program.
Nodes = statement fragments / program statements
Edges = possible execution transitions (i.e., flow of control)

If i and j are nodes in the program graph, an edge exists from node i to node j if and only if the statement fragment corresponding to node j can be executed immediately after the statement fragment corresponding to node i.

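Concept	What it is	Purpose	Relationship
DD-Paths	Straight-line path segments between branches	Simplify path modelling	Basis path analysis is usually built on DD-paths
Coverage metrics	Measures of how complete testing is	Judge whether testing is thorough enough	Apply to all testing methods (including basis path testing)
Basis path testing	A test strategy designed from the control flow graph	Cover all independent logical paths	Uses the control flow graph, DD-paths, and complexity analysis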

DD-Paths 决策到决策路径

A DD-Path (decision-to-decision path) is a commonly used structural testing construct: a maximal sub-path in the control flow graph that runs from one decision node to another decision node or to a terminal node.

In the DD-path graph:
nodes: DD-paths of the program graph
edges: represent control flow between successor DD-paths.

More formally a DD-Path is a chain obtained from a program graph such that:

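• Case 1 (start node): a single node with indeg (in-degree) = 0,

• Case 2 (end node): a single node with outdeg (out-degree) = 0,

• Case 3 (branching node): a single node with indeg ≥ 2 or outdeg ≥ 2 (for example, node D and A),

• Case 4 (ordinary intermediate node): a single node with indeg = 1 and outdeg = 1,

• Case 5: a maximal chain of length ≥ 1.

Construction: convert groups of statements in the CFG (control flow graph) into DD-path nodes; edges represent the connections between paths.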
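The node cases above depend only on in-degree and out-degree, so they can be classified from an edge list. The sketch below covers cases 1 to 4 only (it does not merge chains into maximal DD-paths), and the six-node graph is a made-up example, not one from the slides.

```python
from collections import defaultdict


def classify_nodes(edges):
    """Label each node of a program graph by its in-degree and out-degree."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    nodes = []
    for src, dst in edges:
        for node in (src, dst):
            if node not in nodes:
                nodes.append(node)
        outdeg[src] += 1
        indeg[dst] += 1

    kinds = {}
    for node in nodes:
        if indeg[node] == 0:
            kinds[node] = "start"        # case 1
        elif outdeg[node] == 0:
            kinds[node] = "end"          # case 2
        elif indeg[node] >= 2 or outdeg[node] >= 2:
            kinds[node] = "branch"       # case 3
        else:
            kinds[node] = "chain"        # case 4: indeg = 1 and outdeg = 1
    return kinds


# Hypothetical if-else program graph: 2 branches to 3 or 4, which rejoin at 5.
EDGES = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6)]
print(classify_nodes(EDGES))
```

Consecutive "chain" nodes are the ones that would be merged into a single DD-path node when the DD-path graph is built.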

Test Coverage Metrics 盖率指标

• The motivation of using DD-paths is that they enable very precise descriptions of test coverage.

Specification-based testing may miss redundant or untested code paths.
Coverage metrics measure how much of the code is actually exercised by test cases.

Two Key Coverage Types
1. Statement Coverage

Ensure all individual statements in the program are executed at least once.

2. Predicate (Branch/Decision) Coverage

Ensure every predicate (Boolean condition) is tested for both True and False outcomes.

For if (A or B) then C, good test cases include:

A = True, B = False → evaluates to True

A = False, B = False → evaluates to False

This type of coverage aims to traverse all edges in the DD-Path graph.

DD-Path Graph Edge Coverage

Requirement: the test cases must traverse every edge in the DD-Path graph.

This is essentially equivalent to decision coverage (Predicate / Decision Coverage).

Here a T,T and F,F combination will suffice to have DD-Path Graph edge coverage or Predicate coverage C1

C1P

DD-Path Coverage Testing, a strengthened version of C1

Requirement: when several decisions are involved, consider test cases that exercise all possible combinations of outcomes.

This is the same as C1, but now we must test T,T; T,F; F,T; and F,F for the predicates P1 and P2 respectively.

Multiple Condition Coverage Testing

Applies when a single decision is a compound expression, for example: if (A or B)

It requires that each possible combination of inputs be tested for each decision.

• Example: "if (A or B)" requires 4 test cases:

• A = True, B = True

• A = True, B = False

• A = False, B = True

• A = False, B = False
Problem: For n conditions, 2^n test cases are needed, so the number of tests grows exponentially with n.
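The difference between decision coverage and multiple condition coverage for `if (A or B)` can be shown in a few lines. The function names are illustrative, and the decision is hard-coded to `a or b` for this example.

```python
from itertools import product


def decision(a: bool, b: bool) -> bool:
    """The compound predicate under test: if (A or B)."""
    return a or b


def multiple_condition_cases(n: int):
    """All 2^n truth-value combinations for n conditions."""
    return list(product([True, False], repeat=n))


def achieves_decision_coverage(cases) -> bool:
    """True when the cases drive the decision to both True and False."""
    return {decision(a, b) for a, b in cases} == {True, False}


all_cases = multiple_condition_cases(2)
print(all_cases)                        # the four combinations of A and B
print(len(multiple_condition_cases(5)))  # grows as 2^n

# Two cases are enough for decision coverage (C1)...
print(achieves_decision_coverage([(True, False), (False, False)]))
# ...but two cases that both evaluate to True are not.
print(achieves_decision_coverage([(True, True), (True, False)]))
```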

Dependent DD-Path Pairs Coverage Testing

Builds on C1 by requiring the coverage of data-dependent path pairs.

Two DD-Paths are dependent if a variable is defined in one path and used in another.

Goal: Cover all edges and all such dependent pairs.
Importance:

Detects data flow errors, which is closer to detecting real semantic errors.

Helps identify infeasible paths and subtle logic conflicts.

We have good examples of dependent pairs of DD-paths: in slide 16, C and H are such a pair, as are DD-paths D and H.

Loop Coverage

Used to test loop statements.

Tests loop behavior by exercising following conditions:

Skipping the loop (0 times),

Executing once,

Nominal/typical number of iterations,

Near-boundary and maximum iterations,

Even robustness/overflow cases.

Nested Loops: Start testing from the innermost loop.
Knotted Loops (loops with shared variables): Require data flow analysis.

Concatenated Loop, Nested Loop, and Knotted Loop
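The loop conditions listed above translate into a small set of iteration counts to try. This is a rough sketch: `loop_iteration_counts` is an invented helper, and it assumes the loop has a known maximum and a known typical number of iterations.

```python
def loop_iteration_counts(max_iterations: int, typical: int) -> list[int]:
    """Iteration counts worth testing for a single bounded loop."""
    candidates = [
        0,                    # skip the loop entirely
        1,                    # execute exactly once
        typical,              # nominal number of iterations
        max_iterations - 1,   # just below the bound
        max_iterations,       # at the bound
        max_iterations + 1,   # robustness / overflow attempt
    ]
    counts = []
    for count in candidates:
        if count >= 0 and count not in counts:   # drop duplicates, keep order
            counts.append(count)
    return counts


print(loop_iteration_counts(max_iterations=10, typical=5))
print(loop_iteration_counts(max_iterations=2, typical=1))
```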

e.g

• Statement Coverage C0:

• SCPath1: 1-2-3(F)-10(F)-11-13

• SCPath2: 1-2-3(T)-4(T)-5-6(T)-7(T)-8-9-3(F)-10(T)-12-13

• Branch or Decision Coverage C1:

• BCPath1: 1-2-3(F)-10(F)-11-13

• BCPath2: 1-2-3(T)-4(T)-5-6(T)-7(T)-8-9-3(F)-10(T)-12-13

• BCPath3: 1-2-3(T)-4(F)-10(F)-11-13

• BCPath4: 1-2-3(T)-4(T)-5-6(F)-9-3(F)-10(F)-11-13

• BCPath5: 1-2-3(T)-4(T)-5-6(T)-7(F)-9-3(F)-10(F)-11-13

Basis Path Testing

Basis Path Testing is a white-box testing method proposed by Tom McCabe. It computes the program's cyclomatic complexity to measure its logical complexity and uses that number to identify a basis set of linearly independent execution paths through the program. These paths guide the creation of efficient, thorough test cases.

Cyclomatic Complexity

Cyclomatic complexity (V(G)) measures the complexity of the program's decision structure.

It represents the number of linearly independent paths in the program.

Step1: Compute the Program Graph

Build a Control Flow Graph (CFG), typically represented as a DD-Path graph with a single entry (A) and single exit (G).

Step2: Calculate Cyclomatic Complexity

Use McCabe's formula to determine the number of independent paths:

V(G) = e - n + 2p (the usual form; p = 1 for a single connected component)

V(G) = e - n + p (used when counting circuits in a strongly connected graph)

e: number of edges

n: number of nodes

p: number of connected components, i.e. mutually disconnected subgraphs (usually 1)

The number of linearly independent paths from source node to sink node in the graph is

• V(G) = e - n + 2p = 10 - 7 + 2(1) = 5

The number of linearly independent circuits in the graph, after adding an edge from the sink back to the source (so e = 11), is

• V(G) = e - n + p = 11 - 7 + 1 = 5
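Both formulas can be checked on an edge list. The graph below is a hypothetical 7-node, 10-edge graph with entry A and exit G, chosen only to match the counts used above; it is not necessarily the graph on the slide.

```python
def connected_components(nodes, edges) -> int:
    """Count connected components, ignoring edge direction (union-find)."""
    parent = {node: node for node in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(node) for node in nodes})


def cyclomatic_complexity(edges, strongly_connected=False) -> int:
    """V(G) = e - n + 2p, or e - n + p for a strongly connected graph."""
    nodes = {node for edge in edges for node in edge}
    p = connected_components(nodes, edges)
    factor = 1 if strongly_connected else 2
    return len(edges) - len(nodes) + factor * p


EDGES = [
    ("A", "B"), ("A", "D"), ("B", "C"), ("B", "E"), ("C", "B"),
    ("C", "G"), ("D", "E"), ("D", "F"), ("E", "F"), ("F", "G"),
]

print(cyclomatic_complexity(EDGES))                       # 10 - 7 + 2(1)

# Adding an edge from the sink back to the source makes the graph
# strongly connected, so the e - n + p form applies.
closed = EDGES + [("G", "A")]
print(cyclomatic_complexity(closed, strongly_connected=True))   # 11 - 7 + 1
```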

Step3: Select a Basis Set of Paths

A linearly independent path is any path through the source code that introduces at least one new set of processing statements or a new condition.

Choose paths that each add at least one new edge or decision. Start with a baseline path, then flip decisions one by one to find all independent paths.

n(Basis Paths) = Cyclomatic Complexity

Step4: Generate Test Cases

Design one test case per independent path to ensure complete logical coverage of the program.

observation

• Two major soft spots occur in McCabe's view:

• Testing the set of basis paths is sufficient (it is not!)

• Have to make program paths look like a vector space.

• V(G)=e-n+2p =18-15+2(1) =5

• Original p1: A-B-C-E-F-H-J-K-M-N-O-Last (Scalene)

• Flip p1 at B - p2: A-B-D-E-F-H-J-K-M-N-O-Last (Infeasible)

• Flip p1 at F - p3: A-B-C-E-F-G-O-Last (Infeasible)

• Flip p1 at H - p4: A-B-C-E-F-H-I-N-O-Last (Equilateral)

• Flip p1 at J - p5: A-B-C-E-F-H-J-L-M-N-O-Last (Isosceles)

McCabe's procedure successfully identifies basis paths that are topologically independent, but when these contradict semantic dependencies, topologically possible paths turn out to be logically infeasible.

• If we assume that the logic of the program dictates that "if node C is traversed, then we must traverse node H, and if node D is traversed, then we must traverse Node G"

• These constraints will eliminate Paths 2,3

• We also need a basis path for the NotATriangle case

• We are left to consider four feasible paths:

• p1: A-B-C-E-F-H-J-K-M-N-O-Last (Scalene)

• p4: A-B-C-E-F-H-I-N-O-Last (Equilateral)

• p5: A-B-C-E-F-H-J-L-M-N-O-Last (Isosceles)

• p6: A-B-D-E-F-G-O-Last (Not a triangle)

Guidelines and Observations

Functional testing focuses on external behavior, often disconnected from the internal code.

Path testing emphasizes program structure (graphs), not logic, but helps evaluate test coverage quality.

Basis path testing offers a lower bound on the amount of testing needed.

Path testing provides cross-checks for functional testing:
- If multiple test cases cover the same path, there may be redundancy.
- If some DD-paths are never covered, functional testing may have gaps.
Coverage metrics help:
- Set minimum testing standards
- Identify code segments needing more rigorous testing

Week13 Integration testing

Not on the exam, so I'm skipping it.

CPT304-2425-S2-Software Engineering II

Week1 Software Crisis

No silver Bullet

Essential difficulties*

Accidents偶发性困难

Promising Attacks

Contemporary technologie solution**

Week 2 Object-oriented Concepts

Cohesion ＆ Coupling

SOLID Principles

S -- Single Responsibility Principle*

O -- Open/closed Principle*

L -- Liskov Substitute Principle

I -- Interface Segregation Principle

D -- Dependency inversion Principle

Composition

Week 3-4 Design Pattern

Creational patterns

The Factory Pattern*

Structural patterns

Decorator Pattern*

Behavioral patterns

State.*

Strategy. *

Week 5 Software Reuse

Application Framework

Extension methods

challenges

Application Frameworks vs Libraries

Software Product Lines (SPL)

Benefits

challenges

Commercial Off-The-Shelf (COTS)

COTS vs SPL

Testing

Challenges

Week 6 Critical Analysis

Failure Modes and Effects Analysis (FMEA)

Risk Priority Number（RPN）

Advanced Focus

Root Cause Analysis (RCA)

Fault Tree Analysis

Dynamic Nature of Probability 动态概率

Mean Time To Failure (MTTF)*

Static Nature of Probability 静态概率

Mixed Nature of Probability*混合处理

week8-week9

Week10 Open Source Development

Cathedral Model vs Bazaar Model

Key Principles of Open Source Development***

Case Studies of Open Source Projects

The Social and Cultural Aspects of Open Source Development

Week 11-13 Testing

Week11 Black box / Functional Testing

Boundary Value Testing (BVT)

Normal BVT

Worst-case BVT

Robust BVT

Robust worst-case BVT

e.g triangle problem

e.g The NextDate Function

Equivalence Class Testing

Approach

Weak Normal

Strong Normal

Weak Robust

Strong Robust

Decision Table Based Testing

Structure

Week12 White box testing

Program/ Control Flow Graph

DD-Paths 决策到决策路径

Test Coverage Metrics 盖率指标

Basis Path Testing

Cyclomatic Complexity

Guidelines and Observations

Week13 Integration testing