[Reddes] [Reddes.bvs-tech] indexing HTML text

spinaker spinaker at adinet.com.uy
Wed Apr 4 16:21:40 BRST 2012


I have a database with a field that contains the full text of a page.
I want to index it with technique 4/8, that is, word by word,
but I do NOT want to index the information in the HTML tags.
For example:
<TITLE>National water development report for Ethiopia; A WWAP case study
prepared for the 2nd UN world water development report: Water, a shared
responsibility (2006); 2006</TITLE>
The World Water Assessment Program was initiated in 2000 as a global
mechanism for <br>measuring and reporting Progress with achievement of
objectives in the water <br>sector as part of the international
sustainable development agenda, formalized in 1992, and <br>re-evaluated
and focused on g countries in 2002 in Johannesburg.  The first <br>World
Water Development Report was produced in 2003 and released at the 3rd
World <br>Water Forum in Kyoto.
Is there any procedure for excluding from indexing the information
enclosed between the < > delimiters?
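
To make the desired behavior concrete, here is a minimal sketch in Python of dropping everything between < and > before splitting the text into words. It is not a CISIS or mx feature, only an illustration of a possible preprocessing step applied to the field before loading; the names TextOnly and index_terms are hypothetical.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only character data; tag names and attributes are ignored."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for the text between tags, never for the tags themselves.
        self.chunks.append(data)


def index_terms(html_text):
    """Return the words of html_text in order, skipping all markup."""
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    # Join with a space so that "part<br>of" does not collapse into "partof".
    text = " ".join(parser.chunks)
    return re.findall(r"\w+", text)


sample = "<TITLE>National water development report</TITLE>sector as part<br>of the agenda"
print(index_terms(sample))
```

The resulting word list contains neither TITLE nor br, which is the effect being asked for at indexing time.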

Another question:
I am trying to use the proc Gdump[/<tag>][/nonl][/xml][=<file>]

but I am having problems with some of the parameters and I don't know
what I am doing wrong. If I use
>mx pepe1 "proc='Gdump/1=xxx.txt'"
it works fine,

but if I use any of the other parameters:

c:\temp> mx pepe1 "proc='Gdump/1/xml=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

c:\temp> mx pepe1 "proc='Gdump/1/nonl=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

c:\temp> mx pepe1 "proc='Gdump/nonl=xxx.txt'"
fatal: fldupdat/procx/Gdump/option

Ernesto Spinak

