Pentest FAQ

The comprehensive Application Security Penetration Test FAQ for clients

Here you will find everything you need to know before, during, and after commissioning an application security penetration test – conveniently online or as a PDF for download:

Download Pentest-FAQ DE (PDF)

Download Pentest-FAQ EN (PDF)

Before getting started

0. Introduction

The penetration test – or pentest for short – is still the most popular way to uncover security flaws in web applications. The method is so prevalent primarily because it can be set up with little effort and carried out by a suitably qualified expert or a specialized company.

This simplicity slightly obscures the fact that the success of a pentest depends on a number of factors that need to be taken into account. A pentest is successful when it achieves maximum coverage with minimal effort. Question 12 addresses why the task cannot be defined as finding all vulnerabilities.

If optimal framework conditions are not guaranteed, there is a risk of lulling oneself into a false sense of security (Question 5): one falsely assumes that the application is secure – except for the vulnerabilities that may have been found – and becomes more negligent in the effort to achieve comprehensive security.
Knowing the pitfalls of commissioning a pentest, and avoiding them, goes a long way towards counteracting this: from enabling the pentester to correctly assess the scope, to ensuring a smooth process, to dealing correctly with the results of the test.

This FAQ addresses the many questions that arise around penetration testing of web applications and provides practical answers, along with a wealth of practical experience and some deeper insights from many years of pentesting. It is aimed at anyone who wants to subject their web application to such a review. A document like this, focused on a single topic, can give the impression that the pentest is the sole or best means of establishing security in web applications. The former would be clearly wrong, and the latter is also incorrect in many cases (see Question 36 and Question 37). If a superlative is to be awarded at all, it is that the pentest is the least dispensable means.

Do you have questions or feedback?
We are constantly adapting this FAQ. The goal is to inform you as the client as comprehensively as possible about all aspects of penetration testing that concern you.

Please send us your questions and suggestions for improvement. We will provide a direct response and incorporate them into the FAQ: office@mgm-sp.com

Fundamentals

1. What terms should I know?

From now on, we will simply call the penetration tester the tester and the person who threatens security the attacker. We call the user of the application, who is often the victim of an attack, the user. We uniformly call a security gap, a security problem, or an exploitable weakness a vulnerability, and a discovered vulnerability a finding. We refer to a system in which a security incident has occurred as compromised, and the incident itself as a compromise.

We use the term hazard potential to express how dangerous a vulnerability is; elsewhere, criticality, threat, or risk are commonly used as synonyms. Within the scope of this document, however, the term risk is reserved for its technical definition, namely the product of the probability that the undesirable event will occur (probability of occurrence) and the (maximum) amount of damage resulting from it (amount of damage).
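
As a minimal sketch of this technical definition, risk can be computed as the product of the two factors. The function name and all figures below are hypothetical and serve only to illustrate the arithmetic.

```python
def risk(probability_of_occurrence: float, amount_of_damage: float) -> float:
    """Risk as the product of probability of occurrence and amount of damage."""
    if not 0.0 <= probability_of_occurrence <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return probability_of_occurrence * amount_of_damage


# Hypothetical figures, chosen only for illustration.
frequent_small = risk(0.5, 10_000)     # likely event, limited damage
rare_large = risk(0.01, 2_000_000)     # unlikely event, very high damage

# Under these assumed figures the rare event carries the higher risk.
print(frequent_small, rare_large)
```

The comparison shows why a low probability of occurrence alone says little: with a sufficiently large amount of damage, the resulting risk can still dominate.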

An important concept in terms of risk is that of risk management. This term expresses something very essential, namely that the goal of all security efforts is usually not to achieve absolute security, but rather to bring about the right balance between the negative effects of security measures (costs, delays, restrictions, etc.) and the negative effects of too little security.

2. What exactly is a vulnerability?

A vulnerability is any characteristic of an application that can be assumed to be unintended and that can be misused by a third party – i.e., exploitable in a way that causes damage to the operator of the application. This also includes indirect damage, for example, if the direct victims are the users.

3. Why are custom-developed web applications a serious security threat?

Web applications possess a number of characteristics that make them a serious security risk.

Custom-developed web applications are unique. As such, they have not undergone an industrial maturation process like off-the-shelf applications, where quality-enhancing and error-eliminating effects typically occur simply due to the large number of units. The probability of security-relevant errors is therefore very high per se.
Developing an application with high-quality standards is significantly more expensive than programming the same application without these standards. Nevertheless, at first glance, one will not notice any difference between the inexpensive and the expensive version after completion. This only becomes apparent over time in terms of classic software quality metrics, including high costs and high susceptibility to errors during further development. Regarding security, the difference is expressed in a high risk of damage due to the presence of vulnerabilities. The budget provided for a software project often does not meet these requirements, which is reflected in quality and security deficiencies.
The complexity inherent in an application is often underestimated. This is probably because the internal structure is invisible, the many internal dependencies do not appear externally, and there is no "material consumption" during the software manufacturing process from which the size of the task could be read. However, complexity always means a high probability of errors.
The level of knowledge on security topics among many software engineers is still quite low.
Many application frameworks do not address the topic of security comprehensively enough. If security measures that could be firmly programmed into the respective framework were also implemented there by the framework developers, the programmer would have far fewer opportunities to program anything insecure at all.
Very often, web applications have a connection to sensitive data or have direct access to it. Even if these – mostly well-protected at the network level – are located on another system, they are still accessible via the web application. In the event of security gaps in the application, the protective measures at the network and system level are generally ineffective.

Measures to ensure application security are therefore of great importance, especially for custom-developed applications.

4. Why should security flaws not be equated with quality defects?

Security flaws, just like quality defects, are errors in the application. In both cases, development-accompanying measures must ensure that they do not occur in the first place. And through ongoing and final testing, the error rate should be further minimized in both cases.

Despite this relationship, it is a – dangerous – mistake to apply the findings and procedures from quality assurance unchanged to security. This is due to the following characteristic: In the case of quality defects, there is usually – very roughly speaking – a proportional relationship between the size of the error and the size of the damage. Small error – small damage, large error – large damage. In the case of security problems, however, the relationship is more in the form of a step function: even a small or hard-to-find error can cause very large damage.

This can be illustrated using the example of a shopping application: Suppose the operator of the shop has changes made to the check-out function. Such a change entails, among other things, the risk that the check-out will no longer function due to faulty programming and that, at least temporarily, a total loss of sales will occur. Nevertheless, the risk of damage accompanying the change is very small, because this worst-case scenario can be easily avoided by appropriately designed final tests. If more complex error constellations – for example: the occurrence of a non-Latin character in the customer name causes the check-out to crash – are not immediately discovered, this is tolerable, the case is rare and the damage is correspondingly small.

The situation is different with security. Let's assume that the aforementioned change consists of the introduction of voucher codes and the associated input field in the check-out. Let's also assume – quite realistically – that there is a high level of time pressure during implementation. Then it can easily happen that the responsible developer does not implement the database access needed to check the validity of the voucher properly, by extending the data model, but "just quickly" via a direct database call. (For insiders: instead of secure persistence via an O/R mapper, the extremely insecure method of a dynamic SQL call is used.)

A so-called SQL injection vulnerability introduced in this way is extremely dangerous because, in the worst case, it can be used to penetrate the database or the entire system. A pure quality test does not detect this vulnerability and it does not otherwise become noticeable in any way. However, the moment it is discovered by an attacker, the maximum damage occurs immediately.
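
The difference can be sketched as follows. This is an illustrative example only: the table, the function names, and the voucher code are hypothetical, and a parameterized query stands in for the O/R mapper mentioned above as the simplest safe alternative.

```python
import sqlite3

# Hypothetical voucher table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vouchers (code TEXT PRIMARY KEY, discount INTEGER)")
conn.execute("INSERT INTO vouchers VALUES ('SUMMER10', 10)")


def is_valid_insecure(code: str) -> bool:
    # Dynamic SQL: the user input becomes part of the statement itself.
    query = f"SELECT 1 FROM vouchers WHERE code = '{code}'"
    return conn.execute(query).fetchone() is not None


def is_valid_parameterized(code: str) -> bool:
    # Bound parameter: the input is passed as data and never parsed as SQL.
    query = "SELECT 1 FROM vouchers WHERE code = ?"
    return conn.execute(query, (code,)).fetchone() is not None


# A classic injection string that turns the WHERE clause into a tautology.
malicious = "x' OR '1'='1"
print(is_valid_insecure(malicious))       # True: the invalid code is accepted
print(is_valid_parameterized(malicious))  # False: treated as a literal string
```

Both functions behave identically for ordinary input such as `SUMMER10`, which is why a purely functional test does not reveal the difference.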

The logic that is justifiable in the area of quality assurance – "We have intensively tested the most critical error cases and everything is in order. If we have overlooked hidden quality defects for reasons of time or cost, this is not pleasing, but not so bad." – must therefore not be transferred to security. The "bad cases" cannot be specifically addressed and excluded here.

The solution to this problem is not to always carry out a complete penetration test even for small changes. The costs and delays that occur do not allow this in practice. Rather, this problem is another reason to anchor application security deeply in the process and to understand the pentest as only one of several measures. See also Question 36 and Question 37.

5. Why should I know about the phenomenon of false sense of security?

Knowing that a risk is not sufficiently covered by security measures (i.e. a security gap exists) means that one exercises appropriate caution in everything related to this security gap, i.e. sets a higher standard with regard to security. The Information Officer in a company where intranet communication runs unencrypted over HTTP will work towards greater strictness when defining security rules for external parties than a colleague in a company where all applications are only accessible via HTTPS. Conversely, the intranet will be converted to HTTPS if it turns out that the required strictness cannot be implemented.

Or, in relation to web applications: The realization that one is poorly positioned with regard to application security leads to the decision not to provide sensitive applications via the web in the first place. If, on the other hand, one is doing well here, for example because one systematically operates pentesting, the decision is more likely to be in the direction of approving a sensitive application.

A problem arises when there is a gap between what one thinks one has and what really is in terms of the level of security available. In the first example: The security rules are loosely interpreted because the intranet is now encrypted – but it is overlooked that this has only been done incompletely. In the second example: The decision to make sensitive data available externally is positive because security is under control by means of pentests – and it is overlooked that the pentest specifications are far too weak for such applications.

In summary: If you find that a security measure can only be implemented inadequately, then you must not rely on it! Sometimes it is even better in such a case not to implement the security measure at all. This avoids decisions being made elsewhere as a result of the resulting false sense of security that endanger security far more than the omitted security measure.

6. Does the European General Data Protection Regulation (GDPR), introduced in May 2018, affect pentesting?

Yes, definitely. Another threat has been added to the existing ones: high penalties if personal data falls into the wrong hands or becomes public via a vulnerability. Security measures in general and pentesting in particular have therefore become even more important since the European General Data Protection Regulation became applicable.

Penetration Testing Basics

7. What is a Penetration Test?

In a penetration test, the tester assumes the role of an attacker. Using all available means, they attack the application from the outside – in contrast to analysis techniques that start “inside”, such as code analysis – to uncover vulnerabilities. In doing so, they apply their extensive knowledge of vulnerabilities and use all sorts of tricks to bypass security mechanisms. The result is a report that describes the vulnerabilities in a comprehensible manner, assesses them with regard to their potential risk, and suggests countermeasures.

Because they act as a benign “hacker”, the penetration tester is also often called a White-Hat Hacker (as opposed to the malicious Black-Hat Hacker) or Ethical Hacker.

8. And what is an Application Security Penetration Test not?

A penetration test of a web or mobile application is not about recreating the attacker scenario as realistically as possible in order to conclude whether an attacker could penetrate the application or not, and then classifying it as insecure or secure based on the outcome. Rather, the penetration test should be understood as a quality assurance measure. The goal is to identify as many security-relevant anomalies as possible, evaluate them, also in context, and, if they pose an actual risk, address them for remediation. The tester can perform their role most effectively when they receive the best possible support – more on this in Question 10.

9. What does a penetration test have to do with searching for a needle in a haystack?

A lot! To be precise, the underlying principle is pretty much the same. Except that in a penetration test, there are many “needles”. What they both have in common is this unpleasant characteristic: the probability of finding something increases with the duration of the search and with the number of searchers, without ever reaching 100%. Consequently, it is in the nature of things that the “crowd-sourced” internet, i.e., the sum of all attackers trying to hack the web application, is superior to any penetration tester equipped with a limited time budget. This is because both the number and the search duration are potentially unlimited “on the other side”.
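
The effect can be illustrated with a deliberately simplified model. It assumes, purely for illustration, that every hour of searching by one person finds a given vulnerability independently with the same small probability; the function name and all numbers are hypothetical.

```python
def detection_probability(p_per_hour: float, searchers: int, hours: int) -> float:
    """Probability that at least one searcher finds a given vulnerability.

    Simplifying assumption: every searcher-hour is an independent trial
    with the same success probability p_per_hour.
    """
    return 1.0 - (1.0 - p_per_hour) ** (searchers * hours)


# Hypothetical figures: one tester with a 40-hour budget versus
# fifty attackers who each invest the same amount of time.
tester = detection_probability(0.01, searchers=1, hours=40)
crowd = detection_probability(0.01, searchers=50, hours=40)

print(f"single tester: {tester:.3f}")
print(f"fifty attackers: {crowd:.9f}")
```

In this model the probability grows with both the number of searchers and the search duration and approaches, but never reaches, 100%.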

10. What can be done to counter the superiority of the attackers?

Because the superiority of the opposing side is so overwhelming when using the method of “stochastic” searching, it is all the more important to counter it with something. The effective countermeasure is to provide a maximum of information. The more the tester knows about the internals of the application, the less they have to poke around aimlessly in the fog, and the more targeted and effectively they can use the available time budget.

The information includes, among other things: programming technology, frameworks used, descriptions of the software architecture, security concepts, API descriptions, and, depending on the type of application, other information to be agreed upon individually.

11. What is a retest?

If the penetration test has revealed vulnerabilities that need to be fixed, these should be retested after they have been fixed to the best of one's knowledge. This involves checking whether they have actually been completely eliminated by the measure taken. Strictly speaking, the complete penetration test would have to be carried out again, because a change in one place can lead to security problems in a completely different place. For reasons of cost and time, however, a retest is usually carried out in the following way: It checks whether the vulnerability found no longer occurs at the previously reported point of occurrence. It therefore does not provide the statement that the vulnerability has been fundamentally eliminated, i.e. that it is not present in any other place.

This fact gains further significance in connection with the following paragraph.

12. Does a penetration test find all vulnerabilities?

No, as a rule, it is not possible to conclude after a penetration test that the application contains no further vulnerabilities other than those identified. This is mainly due to the needle-in-a-haystack nature of penetration tests (Question 9).

The following rule can be derived from this:

Rule #1

Do not rely solely on the penetration test, but take additional measures to ensure the security of your application.

13. Does a penetration test always find all the places where a particular vulnerability occurs?

Let's take a normal web application and the SQL injection vulnerability as an example. The application has a number of forms and each form contains a number of fields. SQL injection typically arises where the data entered in the fields is processed. The penetration tester checks, among other things, the forms and fields successively for SQL injection. If they encounter this vulnerability, they describe it in the report. Then they may check further fields, but now only on a random basis. Thus, not every place where SQL injection is present will appear in the report, as the time spent on it can be used much better for searching for other vulnerabilities.

If a vulnerability scanner is used to search for certain types of vulnerabilities (SQL injection is one of the vulnerability categories in which scan tools often perform well), the list of reported locations is most likely longer, but here too, completeness is not guaranteed. Often, a scanner does not discover all forms and fields or encounters other problems that stand in the way of automatic testing.

The result is: The penetration test report does not always contain all the places where a particular vulnerability occurs, but under certain circumstances only one or a few selected ones. This in turn has considerable implications for the task of fixing the vulnerability. See Question 42.

Please note: A simple retest (Question 11) will not detect such unreported occurrences!

14. What is the value of inexpensive penetration tests (quick test, initial test, low-budget test, low-hanging-fruit test, purely tool-based test)?

In general, any kind of test is better than no test at all! But only under one condition: that one does not fall victim to a false sense of security (Question 5)!

A quick test is generally not a sufficient security measure for applications with normal or high protection requirements (Question 30 and Question 33), but it does a good job of …

achieving a limited, but difficult-to-precisely-quantify, risk reduction.
giving the client an initial indication of the security level.
assessing the necessity and type of further measures.

Characteristics of Penetration Tests

15. Blackbox, Greybox, Whitebox?

Depending on how much information the penetration tester starts with, a penetration test is referred to as a blackbox, greybox, or whitebox penetration test. In a blackbox penetration test, the tester receives no additional information and is therefore in the same situation as an external third party. As we learned in Question 10, this variant is generally not suitable for testing web applications.

The whitebox penetration test represents the desired opposite – full information. This extends to inspecting the source code of the application. Since this is usually associated with a significantly higher effort, the compromise of the greybox penetration test is used far more frequently – maximum information, but no provision of the source code.

Selective code reviews and comprehensive automated static code analysis are to be distinguished from the whitebox test with code inspection; they are a sensible addition to the penetration test.

Rule #2

The penetration tester must be equipped with a maximum of information; a greybox or whitebox penetration test must be carried out.

16. What is the process of a penetration test?

1. Preparation of an offer
The foundation for the successful execution of a penetration test is laid during the offer preparation. The better the service provider is enabled to estimate the scope and the overall effort, the more smoothly the test will run. See Question 33 for determining the protection requirements, Question 34 for the complexity of an application, and Question 35 for determining the test effort.

2. Test preparation
The same applies to the quality of the preparation. By the agreed start date, all preparation measures coordinated with the service provider should have been taken. This is not without effort for the client and should be planned accordingly. The experiences addressed in Question 39 show that a lot can go wrong.

3. Test execution
The penetration tester takes on the role of an external attacker. They use all their experience and the information provided to identify vulnerabilities in the application to be tested, supported by professional tools. Report creation usually goes hand in hand with testing. Here, too, support from the client is required to avoid the problems mentioned in Question 39.

4. Report handover
Because of the problems addressed in Question 42, it is recommended not to conclude the penetration test with the receipt of the report, but to consult with the service provider after thoroughly studying the reported vulnerabilities. See Question 41.

5. Retest
After the vulnerabilities have been fixed, it is recommended to carry out a retest in which – in the most common, simplest case – the parts of the application where vulnerabilities were previously found are checked again. What to consider is discussed in Question 11.

17. What exactly is tested during an application penetration test?

In short, most of what can go wrong during the implementation of the application and has a negative impact on security. This primarily includes programming errors, but also incorrect configuration and logical problems.

The following categorization facilitates the overview:

System level:
Here, the software environment in which the web application is embedded is examined. This includes the secure configuration of the web server, CMS, and other system software required for the application to run. In certain cases, this area is excluded; see Question 24.

Technology:
Are basic security properties of the application guaranteed by the (correct) application of security-providing technologies? This area includes the use of secure encryption methods, secure password storage, the implementation of a security architecture, etc.

Implementation:
This is the major area where it is checked whether the developer has made security-relevant programming errors: SQL injection, XSS, CSRF, etc. – the list of possible vulnerabilities is long.

Logic:
If a function is technically programmed correctly, but possibilities for misuse have been overlooked in the internal processes, this is referred to as logical vulnerabilities. For example, password cracking is not prevented, the registration function can be misused for sending emails en masse, or cases can occur in the password reset procedure that allow a third party to break into the account.

Semantics:
If an attacker is planning a phishing attack, for example, they will look for an application where the users are expected to be particularly vulnerable. This is the case, for example, when users are accustomed to receiving regular emails from a company containing links to the application or even being asked to click on embedded links. Users who are “pre-conditioned” in this way are far more likely to click on a fraudulent link in a fake email than users who are accustomed to receiving emails without links from their company. Properties that favor these or similar attacks are checked here.
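
One of the logical vulnerabilities named under Logic above is that password cracking is not prevented. The following is a minimal sketch of one possible countermeasure, a per-account limit on failed logins within a time window. The class name, the thresholds, and the explicit `now` parameter are assumptions made for illustration, not a description of any particular product.

```python
from collections import defaultdict, deque


class LoginThrottle:
    """Block further login attempts after too many recent failures."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # Maps each account to the timestamps of its recent failed logins.
        self._failures = defaultdict(deque)

    def record_failure(self, account: str, now: float) -> None:
        self._failures[account].append(now)

    def is_blocked(self, account: str, now: float) -> bool:
        failures = self._failures[account]
        # Forget failures that lie outside the time window.
        while failures and now - failures[0] >= self.window_seconds:
            failures.popleft()
        return len(failures) >= self.max_failures


# Hypothetical usage with explicit timestamps (in seconds).
throttle = LoginThrottle(max_failures=3, window_seconds=60.0)
for t in (0.0, 1.0, 2.0):
    throttle.record_failure("alice", now=t)

blocked_now = throttle.is_blocked("alice", now=3.0)
blocked_later = throttle.is_blocked("alice", now=100.0)
print(blocked_now, blocked_later)
```

A limit that is tied to the account alone can itself be misused to lock out legitimate users, so real implementations usually weigh further factors; whether such a mechanism exists and how it behaves is exactly the kind of property examined in this category.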

18. How are vulnerabilities assessed?

There is no standardized assessment system for the hazard potential of vulnerabilities. Nevertheless, most providers assess vulnerabilities very similarly, according to a scheme in which the hazard potential is divided into the categories low, medium, high, and critical.

The categories can be roughly interpreted as follows:

Critical:
This vulnerability poses an unacceptable risk. The application must not be launched live; if this has already happened, it must be deactivated immediately.

High:
This vulnerability must be fixed immediately, possibly as part of an emergency patch. Further measures to avoid risk until the problem is resolved should be considered.

Medium:
This vulnerability must be fixed, even if this involves (moderate) additional costs or other (moderate) disadvantages.

Low:
The fix can be included in the release planning as a regular task.

Sometimes another parameter (probability of occurrence or complexity) is added, which provides information about how difficult it is to find and exploit the vulnerability – i.e. how likely it is that the damage will actually occur via the respective vulnerability.

The CVSS (Common Vulnerability Scoring System) methodology presented in Question 19 differentiates even further, with ratings given on a scale of 0 to 10. Although its practicality in the field of web applications is rather limited, it is becoming increasingly widely used.

The pentester makes their assessment primarily on the basis of general “danger scales” of a type of vulnerability; they can only take application-specific threats and abuse scenarios into account to a very limited extent. It is therefore highly advisable to supplement the report with your own assessment after it has been completed: This involves checking whether different assessments result from the inclusion of factors that the pentester is not aware of.

19. What is the CVSS score and why should I care?

“The Common Vulnerability Scoring System (CVSS) is an industry standard for assessing the severity of potential or actual security vulnerabilities in computer systems.” (Source: Wikipedia) When it was designed, the focus was on “off-the-shelf” applications with regard to application security, but not on the special requirements of customer-specific web applications, which are unique by their very nature. Nevertheless, in the absence of better alternatives, CVSS has become increasingly widespread in recent years for assessing vulnerabilities in customer-specific web applications.

In CVSS, the overall score of a vulnerability results from the combination of a series of isolated assessments of individual aspects (metrics), taking into account the weightings stored in the calculation formula.
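
To make this combination concrete, the following sketch computes the base score using the weights and formulas of the CVSS v3.1 specification. It is a simplified illustration, not a complete implementation: it covers only the base metrics, expects a vector without the `CVSS:3.1/` prefix, and does no input validation. The function names are chosen for this example.

```python
import math

# Metric weights as defined in the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place as described in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for a vector such as 'AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score from the confidentiality, integrity, availability metrics.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


def severity(score: float) -> str:
    """Qualitative rating scale of CVSS v3.x."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


full_compromise = base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")
reflected_xss = base_score("AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N")
print(full_compromise, severity(full_compromise))  # 9.8 Critical
print(reflected_xss, severity(reflected_xss))      # 6.1 Medium
```

The second vector is a rating commonly used for reflected cross-site scripting; the metric values chosen for a concrete finding remain the tester's judgment.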

Strengths:

Decomposition into metrics to be assessed individually makes the assessment easier for the pentester
Very good comparability both at the level of the overall score and at the level of the metrics
Easy traceability of the overall score based on “drill-down” to the individual assessments

Weaknesses, related to web applications:

Metrics are included that are unimportant or less important for web applications (example: attack vector metric, temporal score, remediation level metric are defined in a way that has little significance for customer-specific web applications)
Small differences in the input parameters can, under unfavorable circumstances, lead to large differences in the overall score.

So it must be said that the increasing prevalence in the field of application security is due less to its particularly good suitability than to the lack of alternatives.

The following experiences should be taken into account when using the CVSS:

If the comparability of the assessment across many applications and/or different penetration testers is important—for example, in large companies with many applications or when integrating into DevOps processes—CVSS should be considered as the assessment scheme.
Companies that conduct penetration tests occasionally, or where comparability is not important for other reasons, should prioritize simplicity. CVSS is not the first choice in these cases.
CVSS can be counterproductive if the rules for the consequences resulting from a vulnerability are too rigid. For example, if a score exceeding a defined threshold automatically leads to the cancellation of a potentially imminent and business-critical go-live within the deployment process. The process should therefore include a feedback loop to ensure that vulnerability assessments with serious consequences are subject to detailed review and that the CVSS score can be manually adjusted.
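
The feedback loop described in the last point can be sketched as follows. This is a hypothetical release gate, not a reference to any existing tool: the class names, the threshold of 7.0, and the three outcomes are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Decision(Enum):
    GO = "go"
    REVIEW = "review"   # trigger a detailed, manual risk assessment
    NO_GO = "no-go"


@dataclass
class Finding:
    title: str
    cvss_score: float
    # Score after manual review; None means nobody has reviewed it yet.
    reviewed_score: Optional[float] = None

    @property
    def effective_score(self) -> float:
        if self.reviewed_score is None:
            return self.cvss_score
        return self.reviewed_score


def release_decision(findings: Iterable[Finding], threshold: float = 7.0) -> Decision:
    """Escalate unreviewed high scores instead of blocking automatically."""
    decision = Decision.GO
    for finding in findings:
        if finding.effective_score < threshold:
            continue
        if finding.reviewed_score is None:
            # High score that no one has looked at yet: ask for a review.
            decision = Decision.REVIEW
        else:
            # High score confirmed by the manual review: block the release.
            return Decision.NO_GO
    return decision


unreviewed = release_decision([Finding("SQL injection", 9.8)])
adjusted = release_decision([Finding("SQL injection", 9.8, reviewed_score=5.0)])
confirmed = release_decision([Finding("SQL injection", 9.8, reviewed_score=9.8)])
print(unreviewed, adjusted, confirmed)
```

The point of the design is that a score above the threshold never cancels a go-live by itself; it only forces a human decision, which may lower or confirm the score.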

20. How reliable is the assessment and how should I handle it?

It is hardly possible to assess a vulnerability objectively and reliably in every form and with regard to all requirements. The technical relationships are too complex and the scope for interpretation is too great. The assessment is an estimate based on the information available to the penetration tester, which is usually very limited. When in doubt, the penetration tester will err on the side of caution and classify the risk as higher.

Especially when the elimination of a vulnerability is associated with high direct costs (the remediation itself) or high indirect costs (e.g., loss of revenue due to delay or interruption), it is advisable to carry out a more differentiated risk assessment.

This inherent imprecision in the assessment system is also the reason why it is not necessarily a good idea to incorporate unchangeable rules into the release process that prevent an application from being released as long as it contains vulnerabilities of category x. Instead, such a case should trigger a detailed risk assessment, on the basis of which the decision "Go" or "No-Go" is then made.

21. Can penetration tests be automated?

So-called DAST (Dynamic Application Security Testing) tools or application vulnerability scanners promise to comprehensively and automatically scan a web application for vulnerabilities. However, the wide variety of programming technologies, the high complexity of application logic, and the lack of standardization in the dialog processes between client and server mean that such automation remains an unsatisfactorily solved problem to this day. DAST tools are justified in specific cases (such as when testing for very specific vulnerabilities or in build environments with a high test repetition rate), but they are still unsuitable as the sole means. A reputable penetration tester uses such tools at most to support manual testing and as a supplement to the many small tools in their toolbox.

Penetration tests that are carried out completely or primarily by a tool are generally not suitable as a replacement for a manually performed penetration test.

22. Should the penetration test take place on the test or production system?

There is no clear answer to this question. Even if the risk of damage or side effects occurring as a result of the (manually controlled) penetration test is very low, testing on the production system should be avoided if possible. In an ideal world, the penetration test should take place in a staging or pre-production environment that corresponds as closely as possible to the live environment.

However, this requirement is often hampered by other framework conditions and the penetration test must be carried out on a less ideal system, e.g. in production operation.

The presence of personal data would also argue against performing the test on the production system. From a legal perspective, penetration testing on such a system constitutes processing of personal data on behalf of the client and is therefore subject to the rules of the GDPR. See also Question 6.

Rule #3

The penetration test should be carried out on the system that proves to be the most suitable, taking all framework conditions into account. Note possible side effects if it is the production system, especially with regard to data protection.

23. Should testing be performed with or without a Web Application Firewall (WAF) and Intrusion Detection System (IDS/IPS)?

This question arises for companies that have perimeter protection in place that filters at the HTTP protocol level, i.e. is intended to protect against abusive input and access to web applications.

The clear answer is:

Rule #4

Perimeter protection (Web Application Firewall, IDS/IPS) must be deactivated during the penetration test.

Otherwise, it will conceal the vulnerabilities in the application. Recall Question 8: An Application Security Penetration Test is a quality assurance measure. More importantly, if a filter rule is subsequently changed (for example, because it causes problems with another application), the protection of all applications that depend on this rule is suddenly at risk.

Exception: The task is not to penetration-test the web application itself, but to check the security of the overall system or the effectiveness of the configured filter rules.

In addition to the filter function, some web application firewalls offer further protection functions, such as defending against password-guessing attacks or against the systematic harvesting of information through mass requests. If the application relies on these protection functions, the penetration test should be divided into two parts: first the application is tested comprehensively without perimeter protection, then the effectiveness of the other protection functions is checked with the filters activated. Careful coordination in advance is essential to avoid gaps here.

24. Should the underlying infrastructure also be tested?

In large companies with many web applications, these are usually embedded in a professionally operated hosting environment. The security of servers and infrastructure is centrally guaranteed here, so that a penetration test can focus exclusively on the web application.

If this is not the case, it is essential to clarify with the service provider whether the pentest should be extended to the host. In a host test, the following components are examined and the following questions are answered:

Is the server operating system hardened?
Have all superfluous services been disabled and all unneeded ports closed?
Is the web server securely configured?
Are the operating system and all externally visible services up to date with versions for which no vulnerabilities are publicly known?
Are standard software systems, such as a CMS (e.g. WordPress, Drupal, Joomla) or shop software (e.g. Magento, Shopify), used securely and are their security settings configured correctly?

In contrast to the application level, automated tools such as Nessus or OpenVAS perform well in this area (the so-called "system level" according to the categories in Question 17).
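To illustrate the question about superfluous services and open ports, here is a minimal sketch that compares the ports found open on a host with an allowlist. It is not a substitute for the scanners mentioned above; the function names, the example ports and the allowlist are assumptions made for illustration. Probes may only be run against systems you are authorized to test.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unexpected_ports(open_ports, allowed_ports):
    """Return the open ports that are not on the allowlist, sorted ascending."""
    return sorted(set(open_ports) - set(allowed_ports))


# Hypothetical example: a web server that should expose only HTTPS.
print(unexpected_ports([22, 443, 3306], allowed_ports=[443]))  # prints [22, 3306]
```

In practice, the list of open ports would come from a scan, for example by calling is_port_open for each port of interest, and every unexpected port would be reviewed with the hosting team.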

If hosting is in the cloud, the same rules apply: If the cloud provider is responsible for the underlying infrastructure, the review can be limited to the web application. If the servers are managed by the customer, a test of the host system is advisable. The cloud provider's specifications for pentests must be observed.

25. What if my (cloud) infrastructure is complex and exposed, with a correspondingly large attack surface?

Many modern infrastructures are highly distributed, networked and modular, especially in cloud and container environments. High exposure, high complexity and unclear responsibilities (development vs. operations) often result in a much larger attack surface, for example through resources that are accessible without authentication or through inadequate permission checks across multiple system and abstraction levels.

In such cases, the analysis of the security of the overall system must focus more on the architecture and interaction of the components. A cloud audit with a focus on Identity/Access Management, network/infrastructure, data security and logging/monitoring often makes sense.

26. What should be considered when pentesting web applications in the cloud or at hosting providers?

Some cloud or hosting providers prohibit (unannounced) pentesting. The regulations of your own provider should therefore be clarified in good time in advance. During preparation, the pentest service provider should provide support or even be able to completely take these formalities off the client's hands.

27. At what stage of development should the pentest take place?

A pentest that is assigned the task of final security acceptance of a release must be carried out on the final state of this release before it goes live. Subsequent changes should (actually: must!) no longer take place, as this would reduce confidence in the results of the pentest. Changes that are due to fixing found vulnerabilities should be verified by means of a retest (see Question 11).

In addition, a (comprehensive or partial) pentest can take place at any time, provided that a runnable development status is available. However, such preliminary pentests do not replace the final pentest.

28. How often should a pentest take place?

In theory, after every change, because every change can introduce security-relevant errors, even those with massive consequences.

In practice, however, the situation is often different: If an application has already been subjected to a pentest, "seemingly" small changes are usually released without renewed security checks. With somewhat more extensive changes, there is often a desire to only test the changed or newly added functional areas. Only with major changes and when the pentest was a long time ago does the realization dawn that another complete pentest must be carried out.

If the quality criterion of security has been integrated into the development process from the outset, i.e. the recommendations mentioned in Question 37 have been followed, this pragmatic approach may be sufficient. Without this prerequisite, however, this frequently observed procedure must be classified as risky.

In any case, the following applies: At the latest when major changes are made to the architecture and functionality, the original penetration test loses its validity and a new comprehensive test is necessary!

29. How do you test with agile development and continuous integration?

Modern agile development generally requires a different approach to pentests. Here, the motto is to test individual, already completed components of the application early in the development cycle in order to identify possible weaknesses as early as possible and to avoid similar problems in further development. In subsequent releases, individual completed features or a sum of changes can then be tested for security (feature or delta tests). Even if the vulnerability coverage of automated security tests is still very limited, as much automation as possible should be built into the process chain to detect vulnerabilities.

With the agile development process, it is particularly important to free yourself from dependence on the pentest, not least because of the poor automation capabilities of pentests and the short release cycles. See also Question 37.

Scope and price of a pentest

30. What is the protection requirement?

It should come as no surprise that a bank should invest more effort in testing its online banking than the provider of a parking search app. But what criteria should be used as a guide? This is where the protection requirement comes into play. The higher the potential damage in the event of misuse and the more likely a case of misuse is, the higher the protection requirement. An application with a high protection requirement requires a greater testing depth and scope – in relation to the penetration test, in short: more intensive searching – than an application with a low protection requirement. More frequent testing and the use of further analysis techniques, such as code analysis in particular (see Question 37), are also suitable measures for applications with a high protection requirement.

Determining the protection requirement is often not trivial. In addition to the effects on business processes, the primary factor is how worthy of protection the processed data is. It should also not be underestimated that even if the application and data are considered to have limited protection requirements, the loss of trust (damage to reputation) caused by a security incident can be considerable and, in the worst case, could jeopardize the business model.

31. Why does the protection requirement play a role?

Because of the costs. Since applications cannot be tested comprehensively by a machine, the testing effort and price are closely linked. The aim is therefore to determine the level of effort required to achieve an acceptable risk reduction at minimal cost. See the term risk management in Question 1.

32. In what unit is the protection requirement measured?

The metric for classifying the protection requirement is not clearly defined. It usually consists of three to five levels. For penetration testing purposes, the following 3-level classification has proven to be practical: normal, high and very high. Strictly speaking, there is a fourth level, namely the one with no protection requirement at all. Since a penetration test is not required for such applications, we do not need it here either.

The protection requirement of an application is rated as high if a compromise leads to considerable and usually unacceptable damage. If the damage were to be existentially threatening, the protection requirement would have to be classified as very high. All other applications worthy of protection are assigned to the normal level.

33. How do I determine the protection requirement of my application?

The topic of protection requirement assessment goes far beyond IT security. It encompasses all corporate risks and is a science in itself that has produced sophisticated methods. Large companies are generally well positioned here and have a comprehensive risk management system in which the protection requirements of applications and data are embedded. The following is therefore aimed at those who do not have such a system and would like to determine the protection requirement of the application to be tested pragmatically and with simple means.

In order to assess the protection requirement, we must start with the maximum damage scenario. Imagine the following: An attacker has penetrated your application or website and has gained full access. They are able to read any data or manipulate it in any way. They can change the page content, plant false information that is either blatantly obvious or difficult to detect, or install malicious code (Trojans, viruses, ransomware, etc.) with the aim of deceiving users who trust your website. They can establish themselves undetected on the server and carry out further malicious activities from there. Think of the maximum damage that could result from the compromise of the application.

Would the impact of such an incident on your company's business activities have serious consequences? Then the high level is appropriate. If the effects must even be regarded as a threat to the company's existence, the very high level must be selected. In all other cases, the protection requirement would be classified as normal.

The type of data must also be taken into account in the assessment. If personal data such as name, email address or address data is involved, the normal level is generally no longer appropriate, and high is the right choice. This follows solely from the legal consequences of handling personal data (in the worst case, negligently). If, in addition, highly sensitive data such as credit card, payment or account information, health data, other confidential personal data, business secrets or sensitive financial data of the company is accessible through the application, the classification very high is appropriate.

A particular potential danger that must be taken into account is the impact on the company's public image – the damage to its reputation. Regardless of the actual damage, which may be rather minor, this is often very high. Simplistic or improper reporting in the press or the deliberate exaggeration by competitors or groups not well-disposed towards the company can considerably increase the damage to its reputation.

Rule #5

If not already available, a simple protection requirement assessment must be carried out according to the procedure described. When in doubt, it is better to play it safe and choose the next higher level.
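The procedure described above can be sketched as a small decision function. This is a simplified illustration only: the parameter names and the in_doubt flag are assumptions, and answering the underlying questions honestly remains the actual work.

```python
from enum import IntEnum


class Protection(IntEnum):
    NORMAL = 1
    HIGH = 2
    VERY_HIGH = 3


def assess_protection_requirement(
    serious_business_impact: bool,
    threatens_existence: bool,
    personal_data: bool,
    highly_sensitive_data: bool,
    in_doubt: bool = False,
) -> Protection:
    """Classify the protection requirement from the maximum damage scenario."""
    level = Protection.NORMAL
    # Serious consequences or personal data: at least "high".
    if serious_business_impact or personal_data:
        level = Protection.HIGH
    # Threat to the company's existence or highly sensitive data: "very high".
    if threatens_existence or highly_sensitive_data:
        level = Protection.VERY_HIGH
    # Rule #5: when in doubt, choose the next higher level.
    if in_doubt and level < Protection.VERY_HIGH:
        level = Protection(level + 1)
    return level


# Hypothetical example: a newsletter sign-up that stores names and email addresses.
print(assess_protection_requirement(False, False, True, False).name)  # prints HIGH
```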

34. What else has an impact on the price?

The second key factor in determining the testing effort is the size of the application: the number of forms and fields, the length of dialog sequences, the number of functions, the number and type of roles and rights, etc. Since, in addition to these parameters, there are usually others to be taken into account – such as the complexity of dialog sequences and use cases – the term complexity of the application is usually used instead of size.

A third factor often comes into play when examining the externally visible and accessible part of the application is not sufficient to check the security of the system, for example when files are received by the application via upload and processed internally without the results being returned to the web application. It must therefore be clarified whether such areas are to be tested, i.e. the test scope must be defined. The scope must be documented precisely in the final report, particularly if aspects have been excluded, so that it is clear that the penetration test result only provides a partial statement.

The estimated testing effort, and thus the price, is therefore roughly the product of the protection requirement and the complexity of the application, taking the test scope into account. Incidentally, there is a natural limit to the applicability of penetration testing: beyond a certain degree of complexity, a penetration test can no longer be applied comprehensively. It then comes into consideration only for parts of the application, as a complement to techniques such as static code analysis (Question 37).
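The "rough product" can be made concrete with a toy calculation. All numbers below are placeholders invented for illustration, not actual pricing factors; the function name and the scope_share parameter are likewise assumptions.

```python
# Hypothetical multipliers per protection requirement level (placeholders only).
PROTECTION_FACTOR = {"normal": 1.0, "high": 1.5, "very high": 2.0}


def estimate_effort_days(protection: str, complexity_days: float, scope_share: float = 1.0) -> float:
    """Rough effort estimate: protection factor x complexity x share of the application in scope.

    complexity_days is the assumed baseline effort for testing the whole
    application at the normal level; scope_share is the fraction (0..1]
    of the application that is actually in scope.
    """
    if protection not in PROTECTION_FACTOR:
        raise ValueError(f"unknown protection requirement: {protection!r}")
    if not 0 < scope_share <= 1:
        raise ValueError("scope_share must be in the range (0, 1]")
    return PROTECTION_FACTOR[protection] * complexity_days * scope_share


print(estimate_effort_days("high", 4))         # prints 6.0
print(estimate_effort_days("normal", 4, 0.5))  # prints 2.0
```

A real quotation rests on the service provider's experience and on the procedures described in Question 35; the sketch only shows how the three factors interact.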

35. How is the final testing effort determined?

Parameter 1, the protection requirement, is supplied by the operator of the application and should be critically reviewed by the service provider. Parameter 2, the complexity, must be determined as an essential input for the service provider's quotation. Various procedures are available for this, listed here with their advantages and disadvantages:

1. Self-inspection
If the application already exists and all areas to be tested are accessible from the outside, the penetration testing service provider is provided with one or – in the case of different roles – several test users. The provider clicks through the application and independently gains an overview.

Advantages:
Low effort for the client
High accuracy

Disadvantages:
Only possible if the application is accessible to the tester (from the outside)

2. Walk-Through
A web meeting is set up and the client guides the penetration testing service provider through the application.

Advantages:
High accuracy
Enables follow-up questions and facilitates coordination

Disadvantages:
Involves (moderate) effort for the client

3. Based on Documents
Especially when the application is not yet finished, there is the possibility of assessment based on specifications, mockups, and functional descriptions.

Advantages:
Low effort

Disadvantages:
Low accuracy

4. By Example
If the application is a 'typical' representative of an application area (e.g., shop, employee portal, discussion forum, etc.), the estimate can be made based on naming the application area plus answering questions.

Advantages:
Minimal effort for both sides
Fast

Disadvantages:
Prone to errors due to the potential for misunderstandings in communication between client and contractor

Regardless of the method chosen, the most complete possible transmission of the information mentioned in Question 10 contributes to a smooth process and an accurate estimate.

Rule #6

Support the creation of quotations by providing comprehensive and reliable information.

36. Is the pentest sufficient as a means of establishing security?

The answer to this question can best be illustrated with this practical experience: In applications that were developed without regard to the quality characteristic of security – for example, because one assumes that security can be subsequently established based on the pentest results – the pentest predictably delivers a considerable number of vulnerabilities. The really worrying thing, however, is the following: In such applications, the retest after fixing the vulnerabilities disproportionately often shows that the reported vulnerabilities have not been properly eliminated. And later pentests on subsequent releases of the same application usually show the reappearance of previously reported vulnerabilities, perhaps in other places or in a slightly different guise.

Bottom line: If one tries to make an application secure after the fact, via a pentest and the subsequent elimination of the problems it identifies, it will be difficult to ever really get its security under control. Throughout the entire lifecycle of the application, one will have to struggle with unplanned efforts as well as the associated additional costs and delays. Instead, the quality characteristic of security should be anchored in the development cycle (Question 37).

The following applies:

Rule #7

The role of the pentest is not to establish the security of an application. It serves to determine whether a sufficient level of security is present.

37. What else should be done for security?

This question goes far beyond a pentest FAQ, so the most important measures are only briefly listed here. A contact point for comprehensive information on application security is the website of the "Open Worldwide Application Security Project" (OWASP, www.owasp.org).

If you are still at the beginning of a new software project, you are in the fortunate position of being able to carry out the following extremely effective measures:

Train developers: Seminars on introduction to web application security and secure coding training should be a matter of course for all team members.
Conduct a security architecture workshop: The initial considerations regarding the system and software architecture of the application should be coordinated with an application security specialist with regard to the consideration of security aspects.
Building Security In: Include the quality characteristic of security in the development process from the beginning.
Make code inspections for security a development-accompanying measure.
Especially with the agile approach and in continuous integration environments, expand the build chain to include tools for automated security testing, as far as the state of the art allows.

For existing applications:

Architecture analysis: Analyzing the application's integration into the surrounding system landscape, communication relationships, trust transitions, etc.
Static code analysis in the form of manual code inspection of particularly security-relevant code areas or as comprehensive code analysis using commercial SAST tools.

38. Is it sufficient to test only internet applications, or should intranet applications also be tested?

The security representative would naturally say that all intranet applications requiring protection must also be tested. The pragmatist, on the other hand, will argue that this should depend on the trust status of the respective environment and that the internet naturally has a lower or no trust status. The classification is therefore subject to security principles at a higher level; a universally valid answer cannot be given.

However, the following should be considered: Many applications that once started as pure intranet applications – and without special security requirements – eventually experience an expansion of the user group, from their own employees to external partners, and are then opened for access via VPN and, finally, for further simplification, also for access via the internet. If this happens without the introduction of rigorous security and testing measures, high risks arise.

39. How can I contribute to maximizing the quality of the penetration test?

The quality of a penetration test does not depend solely on the quality of the penetration tester. Good organization and communication, as well as a smooth process, also have a major influence. The client can contribute significantly to maximizing quality by avoiding these mistakes:

Late or incomplete provision of the application at the agreed start date.
Test users that do not function or are not equipped with the necessary rights.
Access problems due to hindering firewall settings (It must be checked in advance whether access also works from the outside!).
Failures of the application or individual parts during the test execution.
Parallel tests or deployments of the application that lead to data changes or changes to the functional scope. (These pull the rug out from under the tester's feet, so to speak, because their reference points are no longer correct.)
Non-availability of a contact person who can provide information.
Missing or incomplete test or sample data in the application.
Missing or incomplete description of API calls.

Rule #8

By creating the most ideal framework conditions possible, the client can make a significant contribution to maximizing the rate of detected vulnerabilities.

Dealing with the results report

40. How much time must be planned for fixing vulnerabilities?

A common mistake is to plan too little time between the end of the penetration test and the go-live date. If vulnerabilities are found, there is no longer enough time for a thorough fix, and the application is either patched "in a hurry" (see Question 42) or launched with vulnerabilities that have not yet been fixed. Critical vulnerabilities can then derail the entire schedule at the last minute.

Rule #9

Assume that a penetration test will find vulnerabilities and therefore plan enough time for remediation.

41. How do I deal with the penetration test report?

It is important that the vulnerabilities found are well understood, both in terms of their impact and their cause. If this is not the case, the penetration tester should be contacted for support. (The penetration testing service provider should offer this service.)

Further tips for a pragmatic approach to the results:

In general, the assessments should be reviewed again, as the context of vulnerabilities in the actual environment is sometimes difficult to assess from an external penetration tester's perspective (see Question 18 and Question 19).
Take vulnerabilities classified as critical very seriously and act immediately, see Question 18!
Address vulnerabilities classified from high to low in descending order of potential risk, i.e., allocate the effort – that is, the budget and time available for remediation – largely to the serious vulnerabilities and with correspondingly decreasing priority to the remaining vulnerabilities.
It should not be disregarded that a penetration test sometimes only finds indications of a vulnerability but cannot prove it. Often, the report then includes an assessment with a lower potential risk, supplemented by a corresponding remark. If a finding contains such remarks, it should definitely be investigated to rule out that the risk has been underestimated.
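The tips above can be sketched as a simple triage step over the findings of a report. The severity labels (including "medium"), the proven flag and all names are assumptions for illustration; real reports use the classification of the respective service provider.

```python
from dataclasses import dataclass

# Hypothetical severity labels, ordered from most to least serious.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class ReportFinding:
    title: str
    severity: str
    proven: bool = True  # False if the report only contains indications


def triage(findings):
    """Split findings into: act immediately, reassess the rating, prioritized backlog."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    act_now = [f for f in ordered if f.severity == "critical"]
    # Unproven findings may be rated too low and must be investigated.
    reassess = [f for f in ordered if not f.proven]
    backlog = [f for f in ordered if f.severity != "critical"]
    return act_now, reassess, backlog


findings = [
    ReportFinding("Verbose error messages", "low"),
    ReportFinding("SQL injection", "critical"),
    ReportFinding("Possible access control gap", "medium", proven=False),
    ReportFinding("Stored XSS", "high"),
]
act_now, reassess, backlog = triage(findings)
print([f.title for f in backlog])
```

The backlog is worked through from top to bottom, with budget and time allocated mainly to the entries at the top.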

The other side of the coin should also be considered: Security is not an end in itself! The advantage of higher security often entails accepting disadvantages. Before becoming overly reactive, a vulnerability whose remediation would have side effects should be subjected to a sober cost-benefit analysis.

42. What should be considered when fixing a found vulnerability?

A quick fix that makes the reported vulnerability disappear is usually not sufficient. As shown in Question 13, the goal of a penetration test is not to find all locations where a vulnerability occurs, but merely to prove the existence of the vulnerability and to demonstrate it in the report using a specific occurrence as an example.

Therefore, the problem must be solved fundamentally and permanently, and the following rule must be followed:

Rule #10

The root cause of the problem must be sought and the vulnerability must be fixed there.

If only the symptom of the vulnerability is fixed at the point of occurrence, the same vulnerability may continue to exist elsewhere. Or you or your colleagues who are developing the next release will reintroduce the same vulnerability.

Last but not least

43. OWASP Top 10, OWASP Testing Guide, ASVS, CWE – which penetration test is right for me?

There are various vulnerability collections that penetration tests of web applications are based on. The following should be mentioned in particular:

The OWASP Top 10 are a compilation of the "10 most common risks for web applications". They were intended as a contribution to raising awareness in the field of web applications. The rapidly gained popularity has prompted security providers to advertise their tools and services – often incorrectly – with the feature "covers the OWASP Top 10". This has led to the OWASP Top 10 today being widely understood as the standard for what a penetration test or security analysis should check. They cannot live up to this:

Most of the risks included are named quite generally and exemplarily, thus leaving much room for interpretation with regard to concrete tests. This makes them not particularly useful as a contractual basis.
Only the technical areas are covered, but not the business logic (related to the systematics in Question 17: Levels 4 and 5 are not taken into account)
Important technical vulnerabilities (example: Cross-Site Request Forgery (CSRF)) are not included
One of the risks (logging and monitoring: A10 in the 2017 edition, A09 in the 2021 edition) cannot be checked at all in the general case by means of penetration testing.

In summary: The client-side requirement "Please perform a penetration test according to OWASP Top 10" and the contractor-side offer "We test according to OWASP Top 10" are both very vague and, as a rule, insufficient, even with a broad interpretation. Such a requirement should be specified in more detail and supplemented with additional tests.

The OWASP Testing Guide (OTG) is an extensive collection of vulnerabilities and instructions on how to find them. The OTG also does not aim to be used as a vulnerability catalog. Rather, with its description of penetration testing techniques, it is primarily aimed at (aspiring) penetration testers. Due to the much higher degree of completeness and detail, it is nevertheless far better suited as a basis for commissioning than the OWASP Top 10.

The OWASP Application Security Verification Standard (ASVS) claims to be a comprehensive framework of security requirements and measures (security controls). It covers functional and non-functional requirements and extends not only to testing but also to the design and development of applications. Since its first appearance in 2009, the framework has been significantly revised several times, incorporating experience from user groups, and is now available in a correspondingly mature version 4.

ASVS defines three levels of stringency for the tests. Level 1 corresponds approximately to the protection requirement normal, Level 2 to the protection requirement high and Level 3 to the protection requirement very high according to the nomenclature discussed in Question 32.

Only Level 1 can be covered by penetration tests alone; the other two levels contain tests that cannot be performed by looking "from the outside" or even explicitly require further test methods.

Specifying a "penetration test according to ASVS 4 Level 1" is therefore not sufficient for many applications. Specifying a "penetration test according to ASVS Level 2" is self-contradictory, since Level 2 cannot be covered by penetration tests alone. Restricting the latter to those Level 2 tests that can be carried out by means of penetration testing does not help either: these tests cannot be clearly identified, because the requirements are tailored to test methods other than penetration testing.
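The approximate correspondence between protection requirement and ASVS level, and its consequence for commissioning, can be summarized in a few lines. The function name and the return format are assumptions for illustration; the mapping itself is the approximate one described above.

```python
# Approximate mapping from protection requirement (Question 32) to ASVS level.
ASVS_LEVEL = {"normal": 1, "high": 2, "very high": 3}


def asvs_target(protection: str):
    """Return (ASVS level, whether a penetration test alone can cover it)."""
    level = ASVS_LEVEL[protection]
    # Only Level 1 can be covered by penetration tests alone.
    return level, level == 1


print(asvs_target("normal"))  # prints (1, True)
print(asvs_target("high"))    # prints (2, False)
```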

The Common Weakness Enumeration (CWE) is a continuously updated list of application security vulnerabilities. It aims to be a benchmark for the coverage of security tools and a baseline for vulnerability testing and security measures. The list is very detailed, currently containing over 800 entries. However, the vulnerability descriptions do not provide information on whether a vulnerability can be tested with a penetration test. Therefore, it is also not well suited for sharply defining the scope of a penetration test. Nevertheless, the requirement for the penetration tester to cover the CWE as comprehensively as possible is the most accurate of all the methods mentioned here.

Ultimately, it must be stated that there is no (comprehensively applicable) penetration testing standard that can be referenced to compare offers and commission penetration tests. Unless company policies explicitly specify otherwise, trust in the penetration testing service provider, after detailed consultation, should be the deciding factor for the test procedure and scope.

44. How do I find the right penetration tester for my application?

There is no standard suitable for all situations and requirements that makes it easy to formulate requirements for the penetration tester (see Question 43). To find the appropriate approach for your own needs, you should seek advice from relevant experts.

Tips for choosing a reliable provider:

Ask how long the provider has been performing application security penetration tests / how many applications they have tested / how much experience they have with companies and applications of your size / for which (similar) companies they have already performed penetration tests.
Ask which testing standards the provider uses and how they ensure that the test scope is continuously adapted to the technical development.
Demand that the test is also carried out by the person or persons who possess the relevant experience.
Have them show you a detailed description of their procedure and a sample report.

Compare several providers using this information.
