There are two types of phraselist. The first type holds phrases that are either banned or exceptions. If a word or phrase in a web page matches any item in a banned list then the page is blocked. These must be used with great care for obvious reasons! Similarly, if a word or phrase matches an item in an exception list then the page is not blocked. By default these are very small lists and we have not added to them.

The second type of phraselist is weighted. The items in the weighted lists (which are also categorised) all have a numerical value, which may be positive or negative. The values of the items that appear in the HTML are totalled to give the page a score, which is used to rate the page. Each profile has a variable called the naughtinesslimit (not our name!) which can be changed to reflect the age group: if the page's score is greater than this limit, the page is blocked and the blocking page is sent to the client's browser instead of the one requested. The naughtinesslimit is one of the things E2BN have tuned in collaboration with schools in the region and Becta to meet their accreditation level.
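As a rough illustration of the blocking rule described above, the following Python sketch totals the weights of the items found on a page and compares the result with the limit. The function name, the example weights and the limit values are all hypothetical; this is a sketch of the principle, not the filter's own code.

```python
def is_blocked(matched_weights, naughtiness_limit):
    """Return True when the page score is greater than the profile's limit."""
    page_score = sum(matched_weights)      # positive and negative weights combine
    return page_score > naughtiness_limit  # strictly greater than, as described above


# Hypothetical weights of the items found on one page (total is 65).
weights_found = [50, 40, -25]

print(is_blocked(weights_found, naughtiness_limit=60))
print(is_blocked(weights_found, naughtiness_limit=100))
```

The same page can therefore be blocked under a profile with a low limit and allowed under a profile with a higher one.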

Phraselists allow a great deal of flexibility. While E2BN are not making the ability to change phraselists generally available, an understanding of the general principles will offer some insight into how the system works.

Phraselist syntax: Words.
< word > matches the whole word only.
<word> matches the string word wherever it appears - so, for example, it would match wordsmith, swordsman and sword.
< word> matches words beginning with word - e.g. wordsmith but NOT sword.
<word > matches words ending with word - e.g. sword but NOT wordsmith or swordsman.
<word1>,
<word2>
Will match if BOTH words appear in the HTML of the page. Each term may be in any of the forms above and will match the HTML text accordingly. The important thing is that only if both (actually ALL as you can have any number of elements) match individually does the combination match [for the logicians it is an AND test].
All the above can appear in both banned and exception lists - i.e. banned words & exception words.
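The boundary rules above can be sketched in Python by translating each term into a regular expression. This is only an illustration of the rules as described here, not the filter's real matching code: the function names are hypothetical, and treating a space inside the brackets as a regex word boundary and ignoring case are both assumptions.

```python
import re


def term_to_regex(term):
    """Translate one '<...>' term into a regex, per the boundary rules above."""
    inner = term.strip()[1:-1]                      # text between the angle brackets
    start = r"\b" if inner.startswith(" ") else ""  # leading space: must start a word
    end = r"\b" if inner.endswith(" ") else ""      # trailing space: must end a word
    # Assumption: matching is case-insensitive.
    return re.compile(start + re.escape(inner.strip()) + end, re.IGNORECASE)


def entry_matches(terms, text):
    """AND test: every term in the entry must match somewhere in the text."""
    return all(term_to_regex(t).search(text) for t in terms)


samples = ["word", "wordsmith", "sword", "swordsman"]
for term in ["< word >", "<word>", "< word>", "<word >"]:
    hits = [s for s in samples if term_to_regex(term).search(s)]
    print(repr(term), "matches", hits)
```

A combined entry such as the two-term example above would be checked with entry_matches(["<word1>", "<word2>"], page_text), which is true only when both terms are found.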
Phraselist syntax: Phrases.
Phrases follow a similar pattern - the angle brackets (< & >) indicating where the match boundary comes. In most cases we are interested in the exact phrase so they follow the first form but this is not always the case.
< this is a phrase > matches the whole phrase.
< word1 word2> matches phrases that start at a word boundary but may run on at the end, which can be used to match plurals as well. For example, < car magazine> would match car magazine and car magazines but not autocar magazine. [This is not a real example!]
<car magazine > This will match both car magazine and autocar magazine but not car magazines.
<car magazine> will match all the previous examples.
< car magazine >,<truck magazine> As above this is an AND: only if both match individually does the combination phrase match.
Weighted Phrases - these follow exactly the syntax above but with the addition of a numerical value.
< word ><50> 50 is added to the page score if this word is found on the page. This could equally well be a word or phrase in any of the above forms.
< word ><-50> 50 is subtracted from the page score if this word is found on the page. This could equally well be a word or phrase in any of the above forms. Why? It is used to provide balance and to prevent overblocking. For example, the words medical, news and article are "goodwords" in the default installation.
< word or phrase >,
< another word or phrase ><25>
As before each of the items in the combination must match something on the page before the whole is considered to be a match and the weight added to the cumulative total.
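To make the layout of a weighted entry concrete, the following Python sketch splits an entry into its terms and its weight. It assumes, based only on the examples above, that the last pair of angle brackets in a weighted entry always holds the weight and that an entry may continue over several lines. The function name and the weight given to medical are hypothetical.

```python
import re


def parse_weighted_entry(entry):
    """Split a weighted entry into (terms, weight).

    Assumption: the final '<...>' group is the weight; all earlier
    groups are terms, kept with their inner spaces intact.
    """
    parts = re.findall(r"<([^<>]*)>", entry)
    if len(parts) < 2:
        raise ValueError("expected at least one term followed by a weight")
    *terms, weight = parts
    return terms, int(weight)


entries = [
    "< word ><50>",
    "< medical ><-50>",  # hypothetical weight for a "goodword"
    "< word or phrase >,\n< another word or phrase ><25>",
]

for entry in entries:
    print(parse_weighted_entry(entry))

# If every entry above matched the page, the cumulative total would be:
print(sum(parse_weighted_entry(e)[1] for e in entries))
```

The spaces inside each term are preserved because they carry the boundary information described in the earlier sections.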
 
 
© 2024 E2BN Protex Limited
Protex®, E2B® and E2BN® are registered trade marks and trading names of East of England Broadband Network (Company Registration No. 04649057)