Get a delimited list in Oracle SQL

There are many times when I need to retrieve a delimited list in SQL, and every time I have to track the solution down.

 

So here it is.

SELECT RTRIM (XMLAGG (XMLELEMENT (e, column_name || ',')).EXTRACT ('//text()'), ',') AS name
FROM table_name;

 

Just to break this down a little:

RTRIM (XMLAGG (XMLELEMENT (e, column_name || 'This is the delimiter')).EXTRACT ('//text()'), 'This is the delimiter')

XMLELEMENT wraps each value (with the delimiter appended) in an XML element, XMLAGG concatenates all of those elements into a single XML fragment, EXTRACT ('//text()') pulls out just the text content, and the outer RTRIM strips the trailing delimiter.
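To see it in action, here is the same query against a hypothetical emp table with an ename column (table, column and data are illustrative only):

SELECT RTRIM (XMLAGG (XMLELEMENT (e, ename || ',')).EXTRACT ('//text()'), ',') AS name_list
FROM emp;

-- NAME_LIST
-- ----------------
-- SMITH,ALLEN,WARD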




Converting PDFs into text

PDFtoTXT

I am currently working on a project to extract the text from a PDF purchase order and create an XML file to feed into our ERP system, fully automated obviously.

I therefore needed a way to extract the text from a PDF at the command line; luckily there is a cool utility called poppler-utils. This works perfectly and is extremely fast.

I have installed this on both a CentOS server and, more recently, an Ubuntu server.

The caveat to this product is that you need the PDF to contain real text and not just a scanned image; this is why the Ubuntu server may be used.

mkdir ~/software
cd ~/software
wget https://poppler.freedesktop.org/poppler-0.40.0.tar.xz
wget https://poppler.freedesktop.org/poppler-data-0.4.7.tar.gz
tar xvf poppler-0.40.0.tar.xz
tar xvf poppler-data-0.4.7.tar.gz

There are a few requirements

yum -y install fontconfig fontconfig-devel
yum -y install cairo cairo-devel libjpeg libjpeg-devel libcurl-devel gtk-doc 
yum -y install libtool gcc-c++ lcms2 openjpeg-libs xz openjpeg-devel
yum -y install libtiff-devel lcms2-devel

cd poppler-data-0.4.7
make install

If the earlier tar extraction failed with errors like these:

tar (child): xz: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

Ensure that you install xz

cd poppler-0.40.0
./configure

ERROR

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for style of include used by make... GNU
checking for gcc... no
checking for cc... no
checking for cl.exe... no
configure: error: in `/root/software/poppler-0.40.0':
configure: error: no acceptable C compiler found in $PATH
See `config.log' for more details

ERROR

No package 'fontconfig' found

Ensure that you installed fontconfig fontconfig-devel

INFORMATION

Building poppler with support for:
font configuration: fontconfig
splash output: yes
cairo output: no (requires cairo >= 1.10.0)
qt4 wrapper: no
qt5 wrapper: no
glib wrapper: no (requires cairo output)
introspection: no
cpp wrapper: yes
use gtk-doc: no
use libjpeg: no
use libpng: no
use libtiff: no
use zlib: no
use libcurl: no
use libopenjpeg: no
use cms: no
command line utils: yes
test data dir: /root/software/poppler-0.40.0/./../test

Warning: Using libjpeg is recommended. The internal DCT decoder is unmaintained.
Warning: Using libopenjpeg is recommended. The internal JPX decoder is unmaintained.

Ensure that you installed cairo cairo-devel libjpeg libjpeg-devel libcurl-devel gtk-doc

make
make all-recursive

make[1]: Entering directory `/root/software/poppler-0.40.0'
Making all in goo
make[2]: Entering directory `/root/software/poppler-0.40.0/goo'
CXX libgoo_la-gfile.lo
../libtool: line 1129: g++: command not found
make[2]: *** [libgoo_la-gfile.lo] Error 1
make[2]: Leaving directory `/root/software/poppler-0.40.0/goo'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/software/poppler-0.40.0'
make: *** [all] Error 2

Ensure that you installed libtool gcc-c++ lcms2 openjpeg-libs

make install

You should now have a working copy of the software.

To convert a PDF you simply issue:

pdftotext -layout PDF_Name Output_Filename
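For example, with a hypothetical purchase order PDF (the file names are illustrative):

pdftotext -layout PO_12345.pdf PO_12345.txt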

 

 

Ubuntu Install

mkdir ~/software
cd ~/software

apt-get -y install libcurl4-gnutls-dev libcairo2-dev libcairo2 libjpeg libjpeg-dev \
libtiff5-dev libgtk-doc g++ fontconfig fontconfig-dev fontconfig*

wget https://launchpad.net/ubuntu/+archive/primary/+files/poppler-data_0.4.6.orig.tar.gz
wget https://launchpad.net/ubuntu/+archive/primary/+files/poppler_0.41.0.orig.tar.xz

tar -xvf poppler-data_0.4.6.orig.tar.gz

tar -xvf poppler_0.41.0.orig.tar.xz

cd poppler-data-0.4.6/

make install

cd ../poppler-0.41.0/

./configure --prefix=/usr

make

make install
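You can check the utilities built and installed correctly with the standard version flag:

pdftotext -v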

 


Technical Bits n Pieces

This really is an aide-memoire: a scrap pad of random commands, errors, workarounds and fixes as I come across them.

This is all about making those notes so that I don't have to trawl Google again to find the answers, and at least I know these commands work 🙂

How to grep IP addresses in a file?

I needed to tail a file for IP addresses as something was using a large amount of web server resource. The log file is called httpd-access_log.

tail -f httpd-access_log | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}"
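If you need to see which addresses are hitting the server hardest, the same regex feeds a quick count (a standard sort/uniq pipeline):

grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" httpd-access_log | sort | uniq -c | sort -rn | head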

How to deny an IP address from accessing your server?

Yes you can use a firewall, but if you are behind a load balancer or proxy server then you cannot, as the firewall only ever sees the balancer's address.

In either the httpd.conf or the .htaccess file add the following lines :-

httpd.conf – add these to the directory element of the file

Order allow,deny
Allow from all
deny from Enter IP address

At the end of the file

<Files 403.shtml>
order allow,deny
allow from all
</Files>

.htaccess – add this to the IfModule element. You can deny IP addresses or use a RewriteCond and RewriteRule.

# Block Bad Bots
RewriteCond %{HTTP_USER_AGENT} ^.*(Vagabondo).*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(AhrefsBot).*$ [NC]
# ISSUE 403 / SERVE ERRORDOCUMENT
RewriteRule . - [F,L]
order allow,deny
allow from all

deny from Enter IP address here

Add this at the end of the file

<Files 403.shtml>
order allow,deny
allow from all
</Files>

The server will now respond to the blocked traffic with a 403 Forbidden error.
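The <Files 403.shtml> blocks above are there so the custom error page itself stays accessible. If you serve one, the matching directive (standard Apache, assuming a page at /403.shtml) is:

ErrorDocument 403 /403.shtml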

 

How to log the source IP address when behind a load balancer or proxy?

You use a load balancer to spread the load across your web sites, but how can you find the IP address of whoever is using your site? This is specific to the Apache server, but I am sure something similar will work on nginx.

In the httpd.conf file change the

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

to

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{X-Forwarded-For}i\"" combined

You should now create some custom log files; again in httpd.conf, add the following lines to the VirtualHost element:

ErrorLog logs/httpd-error_log
CustomLog logs/httpd-access_log combined
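With the combined format above, the X-Forwarded-For value lands in the last field of each access log line, so a quick way to watch the real client addresses (assuming that exact format) is:

tail -f logs/httpd-access_log | awk '{print $NF}'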


Allowing SSH access to Ubuntu 14.04 LTS Desktop

We are running some software that needs a Linux desktop. I am not a big fan of Ubuntu, but it is one of the listed supported platforms.

I obviously need VNC access; wow, what a pain. CentOS / Red Hat: easy. Ubuntu!!!!

So I thought I would just SSH on and configure it; well, there is no SSH access out of the box. Built for the GUI generation. To get SSH access:

sudo apt-get update
sudo apt-get install openssh-server
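You can confirm the daemon came up (on 14.04 the service is called ssh):

sudo service ssh status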

 

You now have SSH access!!!

 

Create a patch file and then apply the patch

If you need to create a differential file for redeployment on another system then creating a patch file is ideal.

You will need the patch utilities installed

yum install patch

Now to create your patch file. Copy the original file into the same directory

e.g. cp config.php config.php_chg

Now update the config.php_chg file with all of your amendments

Now to create the patch file:

diff -Naur config.php config.php_chg > config.php.patch

If you want to now apply the patch

patch config.php < config.php.patch
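If you want to see what the patch would change without touching the file first, GNU patch has a dry-run option:

patch --dry-run config.php < config.php.patch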

You should now keep the file in a safe place, or if it is not a sensitive change (i.e. it does not contain passwords) you can add it to https://gist.github.com/


Creating a self-signed certificate

Again this is more an aide-memoire.

Whilst waiting for my real SSL certificates to be created I wanted to continue the configuration of the server that I am building.

I therefore wanted a self-signed certificate to try the config.

mkdir ~/sslselfsign
cd ~/sslselfsign
openssl genrsa -des3 -passout pass:x -out server.pass.key 2048
openssl rsa -passin pass:x -in server.pass.key -out server.key
rm -f server.pass.key
openssl req -new -key server.key -out server.csr

 

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----

Country Name (2 letter code) [XX]:GB
State or Province Name (full name) []:County_Name
Locality Name (eg, city) [Default City]:City_Name
Organization Name (eg, company) [Default Company Ltd]:Company_Name
Organizational Unit Name (eg, section) []:IT
Common Name (eg, your name or your server's hostname) []:xxxx.xxxxx.com
Email Address []:

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

You should now have 3 files:

ls -l

server.crt
server.csr
server.key
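As an aside, if you do not need to keep the CSR around, openssl can generate the key and self-signed certificate in a single step (a standard alternative to the sequence above):

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout server.key -out server.crt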


How to debug SELinux

I am creating an externally facing server and need it to be as hardened as possible. Normally for internal servers I just disable SELinux, as the risk of hacking there is virtually non-existent.

To keep the security up on an external server, though, SELinux should stay enabled.

To use the tools you will need to install the following :-

yum install policycoreutils-python
yum install setroubleshoot-server

With these installed you can now make sense of /var/log/audit/audit.log

tail -n 50 /var/log/audit/audit.log

Will give you something like this

type=AVC msg=audit(1452761897.495:221): avc: denied { unlink } for pid=2013 comm="squid" name="squid-cf__metadata.shm" dev=tmpfs ino=32533 scontext=unconfined_u:system_r:squid_t:s0 tcontext=unconfined_u:object_r:tmpfs_t:s0 tclass=file

There are some tools that you can use to decipher this

mkdir ~/selinux_output
cd ~/selinux_output

You need to identify the application that you are trying to debug, in this instance it is squid as highlighted in the log above.

If you have a large audit log I would use the following commands

service auditd stop
rm -f /var/log/audit/audit.log
service auditd start

Now start the service that has failed or you are trying to debug

service squid start

Now to extract some details and create a policy file:

cd ~/selinux_output
rm -f myaudit.txt
sealert -a /var/log/audit/audit.log > myaudit.txt

You can review what errors are being highlighted in the audit files. The next step is only needed if you are constantly debugging the application and already have a policy file from a previous pass:

mv myapplication.te myapplication.bak

Now to create the policy file; change squid to whichever application you are trying to debug

grep squid /var/log/audit/audit.log | audit2allow -m squid > myapplication.te

When the myapplication.te file has been created review it.

There will be a line at the top formatted something like this.

module squid 1.0;

Ensure that you prefix whatever the application is with my; this ensures that there will not be a conflict with pre-configured policies, or with future policies when they are released.

module mysquid 1.0;

Now we need to create the policy and install it.

Again the first two lines are for when you are constantly updating the policy

rm -f *.mod
rm -f *.pp
checkmodule -M -m -o myapplication.mod myapplication.te
semodule_package -o myapplication.pp -m myapplication.mod
semodule -i myapplication.pp
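You can confirm the module actually loaded (standard semodule listing):

semodule -l | grep myapplication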

You then need to test the application and keep repeating the exercise. Please note that you will need to merge the policies together as you create them; in the example above, merge the require block and the allow rules from the .bak file into the newly created .te file. Below is an example of a file built up from around three or four passes of the above process.


module mysquid 1.0;
require {
        type tmpfs_t;
        type squid_t;
        class dir { write read create add_name remove_name };
        class file { read write create add_name unlink };
}
#============= squid_t ==============
#!!!! The source type 'squid_t' can write to a 'dir' of the following types:
# var_log_t, var_run_t, pcscd_var_run_t, squid_var_run_t, squid_cache_t, squid_log_t, cluster_var_lib_t, cluster_var_run_t, root_t, krb5_host_rcache_t, cluster_conf_t, tmp_t
allow squid_t tmpfs_t:dir { write read create add_name remove_name };
allow squid_t tmpfs_t:file { write read create add_name unlink };

 

Some of the references used for this post

https://wiki.centos.org/HowTos/SELinux#head-faa96b3fdd922004cdb988c1989e56191c257c01

http://squid-web-proxy-cache.1019090.n4.nabble.com/squid-announce-Squid-3-5-9-is-available-td4673315.html


Generate CSR for an SSL certificate

This is more an aide-memoire than a full-on tutorial.

I am creating a Squid reverse proxy server and obviously want to encrypt the data up to the server. More on creating a Squid server later as it is a bit of a pain 🙂

A Squid server needs three files: server.crt, server.key and cacert.crt.

You need a server with openssl installed. Then just run through the following.

 

mkdir ~/sslgenkey
cd ~/sslgenkey
openssl genrsa -des3 -out reporting.key 2048

Enter pass phrase for reporting.key:
Verifying - Enter pass phrase for reporting.key:

Enter a password here (and remember it, as you will need it again in a moment).

openssl req -new -key reporting.key -out reporting.csr

Enter pass phrase for reporting.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.

What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:Country_Code
State or Province Name (full name) []:County_Name
Locality Name (eg, city) [Default City]:City_Name
Organization Name (eg, company) [Default Company Ltd]:Company_Name
Organizational Unit Name (eg, section) []:IT
Common Name (eg, your name or your server's hostname) []:FQDN of your server
Email Address []:DO NOT ENTER
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []: DO NOT ENTER
An optional company name []: DO NOT ENTER
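If you want to sanity-check the request before you submit it, openssl can decode it for you (standard openssl):

openssl req -noout -text -in reporting.csr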

You now need to use your favourite SSL provider. When prompted to enter the CSR:

vi reporting.csr

Copy and paste all of the contents into the area for the CSR and wait for them to create the certificate.


Installing Elasticsearch 1.4.4

I am currently trialling the new version of SugarCRM 7.6, and the installation now requires an Elasticsearch server.

I instantly went to AWS Elasticsearch, but trying to access it is a bit of a pain; more on that later.

To install Elasticsearch you need to install Java. I am installing this on AWS of course, so there are a few extra bits and pieces.

 

To start with, uninstall the OpenJDK packages.

rpm -qa | grep -i java

tzdata-java-2015g-1.35.amzn1.noarch
javapackages-tools-0.9.1-1.5.amzn1.noarch
java-1.7.0-openjdk-1.7.0.91-2.6.2.2.63.amzn1.x86_64

This will give you the Java packages installed.

yum erase java-1.7.0-openjdk

Download the JDK from the Oracle site and transfer it to the server

rpm -i jdk-8u65-linux-x64.rpm
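Check that the Oracle JDK is now the active Java (standard check):

java -version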

Now we want Elasticsearch.

SugarCRM will only work with 1.4.4, so download this version:

mkdir /opt/software
cd /opt/software
wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.4.4.noarch.rpm
rpm -i elasticsearch-1.4.4.noarch.rpm

Now to do some configuration

vi /etc/elasticsearch/elasticsearch.yml

Change the following areas

 

################################### Cluster ###################################

# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
#cluster.name: elasticsearch
cluster.name: elasticsearch

#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
#node.name: "Franz Kafka"
node.name: "node1"

# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data.
#
# Allow this node to be eligible as a master node (enabled by default):
#
#node.master: true
node.master: true

#
# Allow this node to store data (enabled by default):
#
#node.data: true
node.data: true

#################################### Paths ####################################

# Path to directory containing configuration (this file and logging.yml):
#
#path.conf: /path/to/conf
path.conf: /etc/elasticsearch

# Path to directory where to store index data allocated for this node.
#
#path.data: /path/to/data
path.data: /opt/elasticsearch/data

#
# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favouring locations with most free
# space on creation. For example:
#
#path.data: /path/to/data1,/path/to/data2

# Path to temporary files:
#
#path.work: /path/to/work
path.work: /opt/elasticsearch/work

# Path to log files:
#
#path.logs: /path/to/logs
path.logs: /opt/elasticsearch/logs

 

############################## Network And HTTP ###############################

# Elasticsearch, by default, binds itself to the 0.0.0.0 address, and listens
# on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
# communication. (the range means that if the port is busy, it will automatically
# try the next port).

# Set the bind address specifically (IPv4 or IPv6):
#
#network.bind_host: 192.168.0.1

# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
#network.publish_host: 192.168.0.1

# Set both 'bind_host' and 'publish_host':
#
#network.host: 192.168.0.1

# Set a custom port for the node to node communication (9300 by default):
#
#transport.tcp.port: 9300

# Enable compression for all communication between nodes (disabled by default):
#
#transport.tcp.compress: true

# Set a custom port to listen for HTTP traffic:
#
#http.port: 9200
http.port: 9200

 

You now need to create the directory structure that you configured above:

mkdir -p /opt/elasticsearch/work
mkdir -p /opt/elasticsearch/logs
mkdir -p /opt/elasticsearch/data
chown -R elasticsearch:elasticsearch /opt/elasticsearch

If you try and start the service you will get the following error :-

service elasticsearch start

elasticsearch: Failed to configure logging…
elasticsearch: org.elasticsearch.ElasticsearchException: Failed to load logging configuration
elasticsearch: at org.elasticsearch.common.logging.log4j.LogConfigurator.resolveConfig(LogConfigurator.java:135)
elasticsearch: at org.elasticsearch.common.logging.log4j.LogConfigurator.configure(LogConfigurator.java:85)
elasticsearch: at org.elasticsearch.bootstrap.Bootstrap.setupLogging(Bootstrap.java:94)
elasticsearch: at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:178)
elasticsearch: at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
elasticsearch: Caused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/config
elasticsearch: at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
elasticsearch: at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
elasticsearch: at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
elasticsearch: at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
elasticsearch: at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
elasticsearch: at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
elasticsearch: at java.nio.file.Files.readAttributes(Files.java:1737)
elasticsearch: at java.nio.file.FileTreeWalker.getAttributes(FileTreeWalker.java:225)
elasticsearch: at java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:276)
elasticsearch: at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:322)
elasticsearch: at java.nio.file.Files.walkFileTree(Files.java:2662)
elasticsearch: at org.elasticsearch.common.logging.log4j.LogConfigurator.resolveConfig(LogConfigurator.java:119)
elasticsearch: … 4 more
elasticsearch: log4j:WARN No appenders could be found for logger (bootstrap).
elasticsearch: log4j:WARN Please initialize the log4j system properly.
elasticsearch: log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
elasticsearch.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Unit elasticsearch.service entered failed state.

To fix this

mkdir -p /usr/share/elasticsearch/logs
mkdir -p /usr/share/elasticsearch/config
mkdir -p /usr/share/elasticsearch/data
cp /etc/elasticsearch/logging.yml /usr/share/elasticsearch/config/
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/logs
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/config
chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data

To test the connection

curl -X POST 'http://FQDN:9200/tutorial/helloworld/1' -d '{ "message": "Hello World!" }'

Getting the record

curl -X GET 'http://FQDN:9200/tutorial/helloworld/1'
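You can also check the node is healthy (standard Elasticsearch cluster health API):

curl -X GET 'http://FQDN:9200/_cluster/health?pretty'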

 

Note that if you have configured this in AWS you must use the internal IP address of the server, or an FQDN that resolves to the internal address.