Test data

Test data is data that has been specifically identified for use in tests, typically of a computer program.

Some data may be used in a confirmatory way, typically to verify that a given set of inputs to a given function produces some expected result. Other data may be used to challenge the ability of the program to respond to unusual, extreme, exceptional, or unexpected input.

Test data may be produced in a focused or systematic way (as is typically the case in domain testing), or by using other, less-focused approaches (as is typically the case in high-volume randomized automated tests). Test data may be produced by the tester, or by a program or function that aids the tester. Test data may be recorded for re-use, or used once and then forgotten.

Some test data is used to confirm the expected result: when the test data is entered, the expected result should appear. Other test data is used to verify how the software behaves when it is given invalid input.
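
The two kinds of test data described above can be sketched as follows. This is only an illustration: `parse_age` is a hypothetical function under test, and the 0 to 150 range is an assumption made for the example.

```python
def parse_age(text: str) -> int:
    """Hypothetical function under test: parse a human age from text."""
    value = int(text.strip())  # int() raises ValueError for non-numeric text
    if not 0 <= value <= 150:
        raise ValueError(f"age out of range: {value}")
    return value


# Confirmatory test data: (input, expected result) pairs.
CONFIRMATORY = [("0", 0), ("42", 42), (" 150 ", 150)]

# Challenging test data: inputs the function is expected to reject.
INVALID = ["", "abc", "-1", "151", "4.5"]


def run_checks() -> int:
    """Run both data sets and return how many checks passed."""
    passed = 0
    for text, expected in CONFIRMATORY:
        assert parse_age(text) == expected
        passed += 1
    for text in INVALID:
        try:
            parse_age(text)
        except ValueError:
            passed += 1  # rejection is the expected behavior
        else:
            raise AssertionError(f"accepted invalid input: {text!r}")
    return passed
```

Keeping the data in plain lists, separate from the checking loop, makes it easy to record the data for re-use or to extend it later.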

Test data is generated by testers or by automation tools that support testing. In regression testing the test data is usually re-used; it is good practice to verify the test data before re-using it in any kind of test.
Depending on your testing environment, you may need to create test data (most of the time) or at least identify suitable test data for your test cases (if the test data has already been created).

Typically, test data is created in sync with the test case it is intended to be used for.

Test data can be generated in the following ways:

- Manually
- Mass copy of data from the production environment to the testing environment
- Mass copy of test data from legacy client systems
- Automated test data generation tools

Typically, test data should be generated before you begin test execution, because in many testing environments creating test data requires many preliminary steps or test environment configurations, which is very time-consuming. If test data is generated while you are in the test execution phase, you may exceed your testing deadline.

Test Data for White Box Testing
In white box testing, test data is derived from direct examination of the code to be tested. Test data may be selected by taking into account the following things:

- It is desirable to cover as many branches as possible; test data can be generated such that all branches in the program source code are tested at least once
- Path testing: all paths in the program source code are tested at least once; test data can be designed to cover as many cases as possible
- Negative API testing:
  - Test data may contain invalid parameter types used to call different methods
  - Test data may consist of invalid combinations of arguments used to call the program's methods
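
As a rough sketch of these points, the example below picks test data by reading the code of a small function. `shipping_cost` and its pricing rules are hypothetical and exist only for this illustration.

```python
def shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical function under test with several branches."""
    if not isinstance(weight_kg, (int, float)) or isinstance(weight_kg, bool):
        raise TypeError("weight_kg must be a number")
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    cost = 5.0 if weight_kg <= 1 else 5.0 + 2.0 * (weight_kg - 1)
    if express:
        cost *= 2
    return cost


# Branch-oriented data: (arguments, expected result).
BRANCH_DATA = [
    ((0.5, False), 5.0),  # weight <= 1, standard delivery
    ((3, False), 9.0),    # weight > 1, standard delivery
    ((3, True), 18.0),    # express branch taken
]

# Negative API data: (arguments, expected exception type).
NEGATIVE_DATA = [
    (("3", False), TypeError),  # invalid parameter type
    ((None, True), TypeError),  # invalid parameter type
    ((0, False), ValueError),   # invalid argument value
]


def check_all() -> int:
    """Run every data row and return the number of rows checked."""
    for args, expected in BRANCH_DATA:
        assert shipping_cost(*args) == expected, args
    for args, error in NEGATIVE_DATA:
        try:
            shipping_cost(*args)
        except error:
            pass  # the expected exception was raised
        else:
            raise AssertionError(f"no {error.__name__} for {args!r}")
    return len(BRANCH_DATA) + len(NEGATIVE_DATA)
```

Taken together, the two tables exercise each outcome of every `if` in the function at least once. Full path coverage would need more rows, since paths multiply with each independent branch.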

Test Data for Performance Testing
Performance testing is the type of testing performed to determine how fast a system responds under a particular workload. The goal of this type of testing is not to find bugs, but to eliminate bottlenecks. An important aspect of performance testing is that the set of test data used must be very close to the 'real' or 'live' data used in production.

The following question arises: ‘OK, it’s good to test with real data, but how do I obtain this data?’ The answer is fairly straightforward: from the people who know it best, the customers. They may be able to provide some data they already have or, if they don’t have an existing set of data, they may help you by giving feedback on what the real-world data might look like. If you are working on a maintenance testing project, you could copy data from the production environment into the test bed. It is good practice to anonymize (scramble) sensitive customer data such as Social Security numbers, credit card numbers, and bank details while the copy is made.
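
The scrambling step mentioned above might look roughly like this. It is a minimal sketch, not a vetted anonymization scheme: the field names `ssn` and `card_number`, the salt handling, and the 12-character token length are all assumptions made for illustration.

```python
import hashlib


def mask_card(number: str) -> str:
    """Keep only the last four digits of a card number."""
    digits = [c for c in number if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a stable token derived from a salted SHA-256 hash."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def anonymize(record: dict, salt: str) -> dict:
    """Return a copy of the record with sensitive fields scrambled."""
    clean = dict(record)  # the production record itself is left untouched
    clean["ssn"] = pseudonymize(record["ssn"], salt)
    clean["card_number"] = mask_card(record["card_number"])
    return clean


record = {
    "customer_id": 17,
    "ssn": "123-45-6789",
    "card_number": "4111 1111 1111 1111",
}
safe = anonymize(record, salt="test-env-salt")
# safe["card_number"] == "************1111"
```

Using a deterministic token for the same input keeps relationships between copied tables intact, which matters when the copied data must still behave like production data under load. Low-entropy values such as Social Security numbers can still be guessed from a hash, so a real project would likely need a stronger approach.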

Test Data for Security Testing
Security testing is the process that determines whether an information system protects data from malicious intent. The test data designed to fully test the security of a software system must cover the following topics:

- Confidentiality: All the information provided by clients is held in the strictest confidence and is not shared with any outside parties. For example, if an application uses SSL, you can design a set of test data that verifies that the encryption is done correctly.
- Integrity: Determine that the information provided by the system is correct. To design suitable test data, you can start by taking an in-depth look at the design, code, databases, and file structures.
- Authentication: The process of establishing the identity of a user. Test data can be designed as different combinations of usernames and passwords; its purpose is to check that only authorized people are able to access the software system.
- Authorization: Defines the rights of a specific user. Test data may contain different combinations of users, roles, and operations in order to check that only users with sufficient privileges are able to perform a particular operation.
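
For the authorization point, one possible way to build the combinations of roles and operations is sketched below. The roles, operations, and the `is_allowed` function are hypothetical stand-ins for a real system under test.

```python
from itertools import product

# Stand-in for the system under test (hypothetical permission model).
PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}


def is_allowed(role: str, operation: str) -> bool:
    """Hypothetical authorization check under test."""
    return operation in PERMISSIONS.get(role, set())


# Expected outcomes, written down independently of the implementation.
EXPECTED_ALLOWED = {
    ("admin", "read"), ("admin", "write"), ("admin", "delete"),
    ("editor", "read"), ("editor", "write"),
    ("viewer", "read"),
}


def authorization_cases() -> list[tuple[str, str, bool]]:
    """Every (role, operation) pair with its expected outcome."""
    roles = ["admin", "editor", "viewer", "guest"]  # "guest" is an unknown role
    operations = ["read", "write", "delete"]
    return [
        (role, op, (role, op) in EXPECTED_ALLOWED)
        for role, op in product(roles, operations)
    ]


for role, op, expected in authorization_cases():
    assert is_allowed(role, op) == expected, (role, op)
```

Generating the full cross product means the denied combinations are tested as deliberately as the permitted ones, which is where privilege errors tend to hide.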

Test Data for Black Box Testing
In black box testing, the code is not visible to the tester. Your functional test cases can have test data meeting the following criteria:

- No data: Check system response when no data is submitted
- Valid data: Check system response when valid test data is submitted
- Invalid data: Check system response when invalid test data is submitted
- Illegal data format: Check system response when test data is in an invalid format
- Boundary condition data set: Test data meeting boundary value conditions
- Equivalence partition data set: Test data qualifying your equivalence partitions
- Decision table data set: Test data qualifying your decision table testing strategy
- State transition test data set: Test data meeting your state transition testing strategy
- Use case test data: Test data in sync with your use cases
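
Several of these criteria can be derived mechanically from a specification. The sketch below assumes a hypothetical numeric input field that accepts whole numbers in a range `[low, high]`; the function names and the offset of 10 used for out-of-range values are illustrative choices.

```python
def boundary_values(low: int, high: int) -> list[int]:
    """Values at and just around both edges of the valid range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def partition_representatives(low: int, high: int) -> dict[str, int]:
    """One representative value per equivalence partition."""
    return {
        "below_range": low - 10,
        "in_range": (low + high) // 2,
        "above_range": high + 10,
    }


def build_data_sets(low: int, high: int) -> dict[str, list[str]]:
    """Group raw input strings for the field by criterion."""
    parts = partition_representatives(low, high)
    return {
        "no_data": [""],
        "valid": [str(parts["in_range"])],
        "invalid": [str(parts["below_range"]), str(parts["above_range"])],
        "illegal_format": ["abc", "12.5.3"],
        "boundary": [str(v) for v in boundary_values(low, high)],
    }


data_sets = build_data_sets(18, 65)
# data_sets["boundary"] == ["17", "18", "19", "64", "65", "66"]
```

Decision table, state transition, and use case data depend on the behavior of the specific application, so they are usually written by hand from the corresponding test design.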

Automated Test Data Generation
To generate various sets of data, you can use a range of automated test data generation tools. Such tools are typically used to:

- Test a complete application by populating a database with meaningful data
- Create industry-specific data that can be used for a demonstration
- Protect data privacy by creating a clone of the existing data and masking confidential values
- Accelerate the development cycle by simplifying testing and prototyping
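
To give a feel for what such tools do internally, here is a very small generator written with only the Python standard library. It is a sketch rather than a substitute for a dedicated tool; the record fields and name lists are invented for the example.

```python
import random

FIRST_NAMES = ["Alice", "Bob", "Chen", "Dana", "Elif"]
LAST_NAMES = ["Ng", "Okafor", "Patel", "Rossi", "Smith"]


def generate_customers(count: int, seed: int = 0) -> list[dict]:
    """Generate synthetic customer records; the same seed gives the same data."""
    rng = random.Random(seed)  # private generator, independent of global state
    customers = []
    for customer_id in range(1, count + 1):
        customers.append({
            "id": customer_id,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "email": f"user{customer_id}@example.com",
            "balance": round(rng.uniform(0, 10_000), 2),
        })
    return customers


customers = generate_customers(1000, seed=42)
```

Seeding the generator makes a failing run reproducible: the same seed recreates exactly the data that triggered the failure, so the data does not have to be stored.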