discuss.effbot.org

  • Elements and Element Trees (Brief Tutorial) (effbot.org)

    This note introduces the Element, SubElement and ElementTree types available in the effbot.org elementtree library. For an overview, with links to articles and more documentation, see the ElementTree Overview page.

    1 point by effbot 11 months ago
    • 4 comments
  • 1 point by orsenthil 21 days ago 0 children

    Confusing Example

    The element type also provides a text attribute, which can be used to hold additional data associated with the element. As the name implies, this attribute is usually used to hold a text string, but it can be used for other, application-specific purposes.

    from elementtree.ElementTree import Element

    elem = Element("tag")
    elem.text = "this element also contains text"

    If there is no additional data, this attribute is set to an empty string, or None.

    The element type actually provides two attributes that can be used in this way; in addition to text, there’s a similar attribute called tail. It too can contain a text string, an application-specific object, or None. The tail attribute is used to store trailing text nodes when reading mixed-content XML files; text that follows directly after an element is stored in the tail attribute for that element:

    <tag><elem>this goes into elem's
    text attribute</elem>this goes into
    elem's tail attribute</tag>

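    It is tempting to read the XML fragment above as the serialized form of the Python snippet, because both use the names "tag" and "elem". Only on careful reading does it become clear that the two are unrelated: the snippet creates a single element whose tag is "tag" and binds it to a variable named elem, while the XML shows a separate "elem" element nested inside a "tag" element.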
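
    To make the distinction concrete, here is a minimal sketch. It assumes the standard-library xml.etree.ElementTree module rather than the original elementtree package, and the variable names standalone and root are made up for illustration. The first part mirrors the Python snippet, the second parses the XML fragment and reads text and tail from the nested element.

```python
import xml.etree.ElementTree as ET

# The Python snippet: one standalone element whose tag name is "tag".
standalone = ET.Element("tag")
print(repr(standalone.text))  # nothing has been assigned to text yet

# The XML fragment: an "elem" element nested inside a "tag" element.
root = ET.fromstring(
    "<tag><elem>this goes into elem's\n"
    "text attribute</elem>this goes into\n"
    "elem's tail attribute</tag>"
)
elem = root.find("elem")
print(repr(elem.text))  # the text inside <elem>...</elem>
print(repr(elem.tail))  # the text between </elem> and </tag>
```

    Note that the trailing text is stored on elem, not on the enclosing "tag" element.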

  • 1 point by cowtung 1 month ago 0 children

    Convert ElementTree to a dictionary, huzzah.


    class PropDict(dict):
        """Simple dict wrapper that allows element access as an attribute"""
        def __getattr__(self, name):
            try:
                return self[name]
            except KeyError:
                # Raise AttributeError so hasattr() and getattr() defaults keep working.
                raise AttributeError(name)

    def dictify(elem, collections=()):
        """Convert from ElementTree.Element to a PropDict hierarchy. (Easily modifiable to return regular dicts.)
        If you specify tag names in collections, all sub-elements will be sewn up into the parent collection:
        <obj><lics><lic><id>1</id></lic><lic><id>2</id></lic><notlic>blah</notlic></lics></obj> becomes {'lics': [{'id': '1'}, {'id': '2'}, 'blah']}
        """
        def value(child):
            # Leaf elements map to their text; elements with children recurse.
            return dictify(child, collections) if len(child) else child.text

        rval = PropDict()
        for i in elem:  # iterate directly; getchildren() was removed in Python 3.9
            if i.tag in collections:
                rval[i.tag] = [value(j) for j in i]
            elif i.tag in rval:  # dict.has_key() does not exist in Python 3
                # A repeated tag turns the existing entry into a list.
                if not isinstance(rval[i.tag], list):
                    rval[i.tag] = [rval[i.tag]]
                rval[i.tag].append(value(i))
            else:
                rval[i.tag] = value(i)
        if 'text' not in rval and elem.text and elem.text.strip():
            rval['text'] = elem.text.strip()
        return rval
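
    The docstring mentions that the function is easy to change so that it returns regular dicts. A minimal sketch of that variant follows, assuming the standard-library xml.etree.ElementTree module; the name dictify_plain is hypothetical, and this sketch maps leaf elements to their text so that the result matches the docstring's example.

```python
import xml.etree.ElementTree as ET

def dictify_plain(elem, collections=()):
    """Hypothetical plain-dict variant of dictify (illustration only)."""
    def value(child):
        # Leaf elements map to their text; elements with children recurse.
        return dictify_plain(child, collections) if len(child) else child.text

    rval = {}
    for child in elem:
        if child.tag in collections:
            # Sew all sub-elements of a collection tag into one list.
            rval[child.tag] = [value(j) for j in child]
        elif child.tag in rval:
            # A repeated tag turns the existing entry into a list.
            if not isinstance(rval[child.tag], list):
                rval[child.tag] = [rval[child.tag]]
            rval[child.tag].append(value(child))
        else:
            rval[child.tag] = value(child)
    if 'text' not in rval and elem.text and elem.text.strip():
        rval['text'] = elem.text.strip()
    return rval

doc = ET.fromstring(
    "<obj><lics><lic><id>1</id></lic><lic><id>2</id></lic>"
    "<notlic>blah</notlic></lics></obj>"
)
result = dictify_plain(doc, collections=["lics"])
print(result)
```

    Element text is always a string, so the ids come back as '1' and '2' rather than as integers.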

  • 2 points by directrix1 6 months ago 0 children

    I wrote a small function that will strip out either a list of namespaces (if you pass in ['http://ns1','http://ns2'] as nss) or all namespaces (if you pass in ['*'] as nss). Enjoy:


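    def stripns(x, nss):
        """Strip the namespaces in nss (or all of them, for ['*']) from x and its descendants, in place."""
        a = dict(x.attrib)  # work on a copy while iterating over x.attrib
        for dns in nss:
            ns = '{' + dns + '}'
            lenns = len(ns)
            if ns == '{*}':
                # Drop everything up to and including the closing brace, if any.
                x.tag = x.tag[x.tag.find('}') + 1:]
            elif x.tag.startswith(ns):
                x.tag = x.tag[lenns:]
            for i in x.attrib:
                if ns == '{*}':
                    a.pop(i, None)  # pop() tolerates a key already renamed by an earlier pass
                    a[i[i.find('}') + 1:]] = x.attrib[i]
                elif i.startswith(ns):
                    a.pop(i, None)
                    a[i[lenns:]] = x.attrib[i]
        x.attrib = a
        # Recurse into the direct children.
        for i in x.findall('*'):
            stripns(i, nss)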
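
    For a quick check of the behaviour described above, here is a condensed sketch of the same idea, assuming the standard-library xml.etree.ElementTree module. The name strip_namespaces and the sample document are made up for illustration; this is not the function above, only a compact equivalent for trying it out.

```python
import xml.etree.ElementTree as ET

def strip_namespaces(elem, nss):
    """Hypothetical condensed variant: drop the listed namespace URIs, or all for '*'."""
    def local(name):
        # ElementTree stores qualified names as '{uri}local'.
        if isinstance(name, str) and name.startswith('{'):
            uri, _, rest = name[1:].partition('}')
            if '*' in nss or uri in nss:
                return rest
        return name

    elem.tag = local(elem.tag)
    elem.attrib = {local(k): v for k, v in elem.attrib.items()}
    for child in elem:
        strip_namespaces(child, nss)

root = ET.fromstring(
    '<a:root xmlns:a="http://ns1" xmlns:b="http://ns2" a:id="1">'
    '<b:child/></a:root>'
)

# Strip only the first namespace: the child keeps its qualified name.
strip_namespaces(root, ['http://ns1'])
print(root.tag, root.attrib, root[0].tag)

# Strip every remaining namespace.
strip_namespaces(root, ['*'])
print(root[0].tag)
```

    Stripping namespaces can make two different names collide, for example two attributes with the same local name, so this is best used on documents where that cannot happen.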

  • 1 point by lauploix 11 months ago 0 children

    Why isn't there a parent attribute in Element? I understand this is an explicit choice, but why?
