Introduction to Oracle SQL Select Queries

By Terry Hawthorne
Director of Technology, Smyth County Public Schools

This document provides basic instruction in Oracle SQL queries. The intended audience is school employees who use Oracle's SQL Developer software to access information in the PowerSchool student information system's data tables. For this reason, this document only covers the use of the SQL Select statement. SQL statements such as Modify, Update, and Insert don't apply to this scenario, since PowerSchool does not permit users to make changes to the underlying data tables through direct SQL access.

To use these queries, you need Oracle's free SQL Developer software installed on your computer. If you are an employee of Smyth County Public Schools, please contact me or your school's ITRT or technician in order to get SQL Developer. Once you get it installed and configured, you can use any of the queries in these examples and my SQL Examples page by just copying and pasting them into SQL Developer. The queries are highlighted in light yellow.

Making the SQL Developer Connection

Before you can use SQL Developer, you must establish a connection to an Oracle database. This consists of giving the connection a name, such as PowerSchool, and specifying the database username, password, host, port, and SID (PSPRODDB for PowerSchool connections). After you enter this information, click the Test button to ensure that your connection works. If it doesn't, you have entered something incorrectly or there is a firewall policy blocking the connection.

Screenshot of SQL Developer database connection dialog.

Your First Query

After setting up your connection, enter the following text into an SQL Developer query window and click the green arrow or press Ctrl+Enter.


SELECT *
FROM students
WHERE enroll_status = 0
  AND schoolid = 680 --Change the schoolid to match your school's three-digit id number
ORDER BY grade_level,lastfirst;

What Just Happened?

If you entered the query correctly and you are connected to a PowerSchool database, SQL Developer will display all of your currently enrolled students, with all of the columns in the Students table visible. Students will be sorted by grade level, then by name. Here is an explanation of each of the SQL keywords.

SELECT column1,column2, etc. [AS column alias]
All of your queries are going to begin with the SQL SELECT keyword. Follow the SELECT keyword with a comma-separated list of the columns you wish to retrieve. Do not put a comma after the last column. You can also follow SELECT with *, which is a wildcard that retrieves all of the columns in the table. SELECT, like all SQL keywords, is case-insensitive, but you will frequently see it and other SQL keywords in all caps, to distinguish them from other parts of SQL statements, such as the names of columns, which are not case-sensitive, either.

SELECT has an optional keyword, AS, which is used to temporarily change the name of a column. The temporary name is called an alias. For example, the following SELECT statement will retrieve the student's student_number and lastfirst columns, and label lastfirst as Student_Name in the SQL Developer output grid:

SELECT student_number, lastfirst AS Student_Name

Note that the temporary column names you create with AS cannot contain spaces. It is customary to use an underscore character instead of spaces.

FROM table1,table2, etc.
The FROM keyword specifies the table or tables that are the source of the columns you choose with the SELECT statement. Separate the tables with commas. Do not place a comma after the last table. You can also use aliases for table names, which will save you keystrokes at the expense of some clarity in your queries. If you are pulling data from multiple tables that contain columns with the same name, you have to modify your SELECT statement to make it clear which table is the source for the data. For example, the PowerSchool Students and Teachers tables both have columns named lastfirst, which contain the person's last name, followed by a comma and the person's first and middle names; e.g., Doe, John Roberts. If you want to include those columns in a query that includes both the Students and Teachers tables, you have to write something like this:

SELECT students.lastfirst AS Student_Name, teachers.lastfirst AS Teacher_Name
FROM students,teachers
...

The following statement is equivalent to the previous one, except that it aliases the Students table as s and the Teachers table as t. Note that the AS keyword is not used when aliasing table names.


SELECT s.lastfirst AS Student_Name, t.lastfirst AS Teacher_Name
FROM students s,teachers t
...

Please note: the preceding two statements are not complete queries. Don't run them, because they don't specify how rows in the students table are joined to rows in the teachers table. In fact, as you will see later, the Students and Teachers tables are not directly linked.

WHERE
The WHERE keyword serves two main purposes:
  1. Filtering the rows returned by the query by requiring columns in those rows to match specific criteria;
  2. Specifying how tables in multi-table queries are joined.

In our first query example, we filtered the students table by specifying that the enroll_status column had to equal 0 (WHERE enroll_status = 0). The PowerSchool enroll_status column contains a 0 if the student is currently enrolled in our school division. It contains another numeric value if the student has withdrawn (enroll_status = 2) or graduated (enroll_status = 3) or is pre-registered (enroll_status = -1).

You can chain multiple WHERE conditions together by using the AND keyword; for example WHERE enroll_status = -1 AND schoolid = 290 will return all of the students who are pre-registered at school number 290. The AND keyword means that both criteria conditions have to be met. There is also an OR keyword, which means that only one of the chained criteria conditions has to be met. For example, WHERE last_name = 'Jones' OR last_name = 'Smith' will return all students whose last name is Jones or Smith. A shorter method of writing this last query is to use the IN keyword, like this: WHERE last_name IN ('Jones','Smith'). Note the use of a comma to separate the different criteria. Note also that columns that contain textual information, such as names, have to have criteria enclosed within single quotes. This also applies to dates. For example, if you want to find all students who enrolled on or after a particular date, use WHERE entrydate >= '08/20/2014'. The format you use for date criteria must also match the format specified in SQL Developer's Preferences. Navigate to Tools-->Preferences-->NLS-->Date Format and enter MM/DD/YYYY if you want to use dates formatted like this: 08/20/2014. SQL Developer's default date format is DD-MON-RR, which results in dates formatted like 20-AUG-14.

The WHERE keyword's second main purpose is to specify how related tables are joined. In a relational database like PowerSchool, information is stored in related tables, each of which has a primary key (PK). The PK is a column that uniquely identifies each row in the table. It is usually called the ID column and is automatically generated by PowerSchool. Note that in the student table, the ID column is not the same as the student_number column, even though the student_number column itself is unique. We can change the student_number for a student, but we can never change a student's ID column. Primary Keys are frequently used to join related tables in a multi-table query. Consider this common example: you want to write a query to list all of the students in a school by their homeroom. This requires you to join five tables, as illustrated below.

Entity relationship diagram showing five table join.

You may be wondering why the schoolid and termid fields appear in the CC table in the above illustration, when they also occur in the Sections table. So do I. I debated about whether to include them in my mirrored Access table. I can't think of any situation in which the schoolid and termid values in the CC table would differ from those in the joined Sections table. In fact, there isn't any way of generating an entry in the CC table from within the PowerSchool web interface that would result in those values being different. I decided to leave them in to make my mirrored table a more accurate reflection of the PowerSchool CC table, and to illustrate another point: commercial database applications, even pricey ones like PowerSchool, are not always normalized. Normalization is the process of organizing the fields and tables of a database to minimize redundancy. PowerSchool's developers have some reason for duplicating the schoolid and termid fields in the CC table; I just don't know what it is.

The main tables in this multi-table join are Students, Sections, Course, and Teachers. There is a many-to-many relationship between the Students and Sections table. That means that each student can be enrolled in many sections, and each section can have many students enrolled in it. PowerSchool uses the CC (Current Classes) table to resolve the many-to-many relationship. The CC table contains the primary keys from both the Students and Sections tables. When a primary key is used in a related table, it is called a foreign key in the related table. When joining related tables, it is usually a good idea to join them by having the primary key from one table be a foreign key in its related tables, but that is not absolutely necessary. Note that PowerSchool uses the Course_Number column to join the Course and Sections tables. Note also that PowerSchool is not consistent in naming foreign keys. In the CC table it uses StudentID and SectionID as the names of the foreign keys from the Students and Sections tables, but it just uses Teacher, rather than TeacherID as the name of the foreign key from the Teachers table. These are just things you have to learn when dealing with PowerSchool.

The illustration above does not accurately reflect all of the names used by PowerSchool in its Students, CC, Sections, Course, and Teachers tables. I have created some custom Microsoft Access applications for users in our school division. Due to problems we have experienced with Oracle's ODBC software–which enables a direct connection to PowerSchool tables from applications like Access and Excel when it works–I set up an MS Access database that contains a mirror of the main PowerSchool tables, and then I connect my MS Access applications to the mirrored Access database, rather than directly to PowerSchool. I use SQL Developer to export data directly from PowerSchool into CSV files, which I use to populate my mirrored Access database. The illustration above is from the mirrored Access database, not PowerSchool.

The following query is one that you could use in SQL Developer to list all students in a particular school by homeroom. It demonstrates how to use WHERE clauses to join tables together. This query also demonstrates the liberal use of comments, to explain the purpose of each clause. Comments begin with two hyphens.


--Alias students as st, sections as s, teachers as t, and courses as c
--Because students and teachers tables both have column called lastfirst
--We have to specify the table source, using tablename.column name format
SELECT st.student_number,st.lastfirst AS Student,t.lastfirst AS Teacher
FROM students st,cc,sections s,teachers t,courses c
WHERE st.id = cc.studentid
  --Preceding line joins primary key of students table to CC table's foreign key studentID.
  AND cc.sectionid = s.id
  --Preceding line joins CC table's foreign key sectionid to primary key of Sections table.
  AND s.course_number = c.course_number
  --Preceding line joins Sections table's course_number column to course_number column in Courses table.
  --This is an example where a join is accomplished without using a primary key.
  AND s.teacher = t.id
  --Preceding line joins Sections table's teacher column foreign key to the teacherid column in the Teachers table.
  AND c.course_name = 'Homeroom' --Now we are back to using WHERE clauses to filter, here by course. 
  AND s.termid = 2400 --Filter by the 2014-2015 school year, which is termid 2400.
  AND s.schoolid = 730 --Filter by schoolid. 
ORDER BY t.lastfirst,st.lastfirst; --Order by teacher, then by student.
The preceding query will run perfectly well without the comments. The comments' only purpose is to document the query for humans.
ORDER BY
ORDER BY is used to create a sort order for the rows returned by the query. Follow it with a comma-separated list of column names. For example, to order a list of students by grade, use ORDER BY grade_level. The default is to order rows in ascending order. You can use the optional DESC keyword to order the rows in descending order. For example, to retrieve a list of the students in a school beginning with the youngest, use ORDER BY dob DESC. The dob column records the student's date of birth. Ordering it in descending order will list the most recent dates of birth first.
The Concluding Semicolon
You have to put a semicolon at the end of your SQL queries. The semicolon marks the end of the query. If you leave it out, your query won't run.

One SQL Developer Gotcha

When I first began using SQL Developer, I noticed that my queries would sometimes not update when I clicked the green arrow, which executes the query statement. After troubleshooting this problem for an embarrassingly long time, I realized that the problem occurred whenever the cursor was placed before the SELECT keyword (for example, when I began a query with a comment), or after the concluding semicolon. Keep the cursor within the confines of the query statement–between SELECT and the semicolon–and you will be fine.

Where to Go From Here

If you are a PowerSchool user, see my PowerSchool SQL Examples page. I include a number of useful PowerSchool SQL queries there, along with explanations of how they work.

If you are interested in learning more about SQL, Google "SQL tutorials". The one provided by W3Schools is a good tutorial for beginners.

And if you ever get a chance to take a database development/data warehousing course, workshop, or seminar led by Jeff Pittges, don't pass up that opportunity! Dr. Pittges was my database development professor in grad school at Radford University, and I remain forever grateful to him for the knowledge and skills I learned in that class.